Dataset statistics
| Number of variables | 24 |
|---|---|
| Number of observations | 45466 |
| Missing cells | 105562 |
| Missing cells (%) | 9.7% |
| Total size in memory | 79.6 MiB |
| Average record size in memory | 1.8 KiB |
Variable types
| Text | 18 |
|---|---|
| Unsupported | 1 |
| Numeric | 4 |
| Boolean | 1 |
Alerts
| video is highly imbalanced (97.9%) | Imbalance |
|---|---|
| belongs_to_collection has 40972 (90.1%) missing values | Missing |
| homepage has 37684 (82.9%) missing values | Missing |
| overview has 954 (2.1%) missing values | Missing |
| tagline has 25054 (55.1%) missing values | Missing |
| popularity is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| revenue has 38052 (83.7%) zeros | Zeros |
| runtime has 1558 (3.4%) zeros | Zeros |
| vote_average has 2998 (6.6%) zeros | Zeros |
| vote_count has 2899 (6.4%) zeros | Zeros |
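The percentages in the alerts above can be re-derived from the raw counts and the observation count in the overview table. The following is a minimal sketch of that arithmetic; the helper name `percent` and the dictionary layout are illustrative assumptions, not part of the report.

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


N_OBSERVATIONS = 45466  # "Number of observations" in the overview table
N_VARIABLES = 24        # "Number of variables" in the overview table

# Counts copied from the alerts; the dict layout is only for illustration.
missing = {
    "belongs_to_collection": 40972,
    "homepage": 37684,
    "overview": 954,
    "tagline": 25054,
}
zeros = {
    "revenue": 38052,
    "runtime": 1558,
    "vote_average": 2998,
    "vote_count": 2899,
}

missing_pct = {name: percent(n, N_OBSERVATIONS) for name, n in missing.items()}
zeros_pct = {name: percent(n, N_OBSERVATIONS) for name, n in zeros.items()}

# Missing cells are counted over every cell, i.e. observations x variables.
total_missing_pct = percent(105562, N_OBSERVATIONS * N_VARIABLES)

print(missing_pct, zeros_pct, total_missing_pct)
```

Each computed value agrees with the percentage printed in the corresponding alert, and the overall figure agrees with the 9.7% missing cells in the overview.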
Reproduction
| Analysis started | 2024-04-26 14:46:51.966536 |
|---|---|
| Analysis finished | 2024-04-26 14:46:59.814800 |
| Duration | 7.85 seconds |
| Software version | ydata-profiling v4.7.0 |
| Download configuration | config.json |
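The reported duration is the difference between the two timestamps in the table above. As a quick check, a sketch using only the standard library (the format string is an assumption based on how the timestamps are printed):

```python
from datetime import datetime

# Format inferred from the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-04-26 14:46:51.966536", FMT)
finished = datetime.strptime("2024-04-26 14:46:59.814800", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # prints "7.85 seconds"
```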
adult
Text
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
Length
| Max length | 126 |
|---|---|
| Median length | 5 |
| Mean length | 5.00508072 |
| Min length | 4 |
Characters and Unicode
| Total characters | 227561 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | False |
|---|---|
| 2nd row | False |
| 3rd row | False |
| 4th row | False |
| 5th row | False |
| Value | Count | Frequency (%) |
| false | 45454 | |
| true | 9 | < 0.1% |
| to | 4 | < 0.1% |
| a | 4 | < 0.1% |
| the | 2 | < 0.1% |
| avalanche | 2 | < 0.1% |
| by | 2 | < 0.1% |
| when | 1 | < 0.1% |
| contest | 1 | < 0.1% |
| hit | 1 | < 0.1% |
| Other values (32) | 32 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 45479 | |
| a | 45475 | |
| s | 45465 | |
| l | 45461 | |
| F | 45454 | |
| (space) | 49 | < 0.1% |
| r | 25 | < 0.1% |
| t | 23 | < 0.1% |
| o | 19 | < 0.1% |
| n | 17 | < 0.1% |
| Other values (24) | 94 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 182039 | |
| Uppercase Letter | 45470 | 20.0% |
| Space Separator | 49 | < 0.1% |
| Other Punctuation | 2 | < 0.1% |
| Dash Punctuation | 1 | < 0.1% |
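The category labels in the table above correspond to Unicode general categories. How the report computes them internally is not stated here; the sketch below reproduces the same kind of tally with Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping, the function name, and the demo strings are illustrative assumptions.

```python
import unicodedata
from collections import Counter

# Readable names for a few two-letter general-category codes (not exhaustive).
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code when no readable name is listed.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Hypothetical cell values, chosen only to exercise several categories.
demo = category_counts(["False", "True", "to a."])
print(demo)
```

Applied to a whole column, a tally like this would show at a glance whether a field expected to hold only `True`/`False` contains spaces or punctuation, as the adult column does.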
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 45479 | |
| a | 45475 | |
| s | 45465 | |
| l | 45461 | |
| r | 25 | < 0.1% |
| t | 23 | < 0.1% |
| o | 19 | < 0.1% |
| n | 17 | < 0.1% |
| i | 13 | < 0.1% |
| u | 12 | < 0.1% |
| Other values (12) | 50 | < 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 45454 | |
| T | 9 | < 0.1% |
| B | 1 | < 0.1% |
| R | 1 | < 0.1% |
| Ø | 1 | < 0.1% |
| O | 1 | < 0.1% |
| W | 1 | < 0.1% |
| A | 1 | < 0.1% |
| S | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 49 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 227509 | |
| Common | 52 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 45479 | |
| a | 45475 | |
| s | 45465 | |
| l | 45461 | |
| F | 45454 | |
| r | 25 | < 0.1% |
| t | 23 | < 0.1% |
| o | 19 | < 0.1% |
| n | 17 | < 0.1% |
| i | 13 | < 0.1% |
| Other values (21) | 78 | < 0.1% |
Common
| Value | Count | Frequency (%) |
| (space) | 49 | |
| . | 2 | 3.8% |
| - | 1 | 1.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 227559 | |
| None | 2 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 45479 | |
| a | 45475 | |
| s | 45465 | |
| l | 45461 | |
| F | 45454 | |
| (space) | 49 | < 0.1% |
| r | 25 | < 0.1% |
| t | 23 | < 0.1% |
| o | 19 | < 0.1% |
| n | 17 | < 0.1% |
| Other values (22) | 92 | < 0.1% |
None
| Value | Count | Frequency (%) |
| Ø | 1 | |
| å | 1 |
belongs_to_collection
Text
MISSING
| Distinct | 1698 |
|---|---|
| Distinct (%) | 37.8% |
| Missing | 40972 |
| Missing (%) | 90.1% |
| Memory size | 2.1 MiB |
Length
| Max length | 184 |
|---|---|
| Median length | 167 |
| Mean length | 141.4063195 |
| Min length | 8 |
Characters and Unicode
| Total characters | 635480 |
|---|---|
| Distinct characters | 170 |
| Distinct categories | 13 |
| Distinct scripts | 7 |
| Distinct blocks | 8 |
Unique
| Unique | 393 |
|---|---|
| Unique (%) | 8.7% |
Sample
| 1st row | {'id': 10194, 'name': 'Toy Story Collection', 'poster_path': '/7G9915LfUQ2lVfwMEEhDsn3kT4B.jpg', 'backdrop_path': '/9FBwqcd9IRruEDUrTdcaafOMKUq.jpg'} |
|---|---|
| 2nd row | {'id': 119050, 'name': 'Grumpy Old Men Collection', 'poster_path': '/nLvUdqgPgm3F85NMCii9gVFUcet.jpg', 'backdrop_path': '/hypTnLot2z8wpFS7qwsQHW1uV8u.jpg'} |
| 3rd row | {'id': 96871, 'name': 'Father of the Bride Collection', 'poster_path': '/nts4iOmNnq7GNicycMJ9pSAn204.jpg', 'backdrop_path': '/7qwE57OVZmMJChBpLEbJEmzUydk.jpg'} |
| 4th row | {'id': 645, 'name': 'James Bond Collection', 'poster_path': '/HORpg5CSkmeQlAolx3bKMrKgfi.jpg', 'backdrop_path': '/6VcVl48kNKvdXOZfJPdarlUGOsk.jpg'} |
| 5th row | {'id': 117693, 'name': 'Balto Collection', 'poster_path': '/w0ZgH6Lgxt2bQYnf1ss74UvYftm.jpg', 'backdrop_path': '/9VM5LiJV0bGb1st1KyHA3cVnO2G.jpg'} |
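The sample rows above are Python dict literals stored as text, which is why the column is profiled as Text. A minimal sketch of turning one cell into a dictionary with the standard library follows; the function name `parse_collection` and the choice to return `None` for anything unparseable are assumptions made for illustration.

```python
import ast


def parse_collection(raw):
    """Parse one cell into a dict; return None if missing or malformed."""
    if not isinstance(raw, str):
        return None  # e.g. a missing value
    try:
        # literal_eval only accepts Python literals, so it does not run code.
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return None
    return value if isinstance(value, dict) else None


# First sample row from the table above.
row = (
    "{'id': 10194, 'name': 'Toy Story Collection', "
    "'poster_path': '/7G9915LfUQ2lVfwMEEhDsn3kT4B.jpg', "
    "'backdrop_path': '/9FBwqcd9IRruEDUrTdcaafOMKUq.jpg'}"
)
parsed = parse_collection(row)
print(parsed["name"])  # prints "Toy Story Collection"
```

`ast.literal_eval` is used instead of `json.loads` because the cells use single quotes, which are not valid JSON.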
| Value | Count | Frequency (%) |
| name | 4497 | 9.7% |
| id | 4491 | 9.7% |
| backdrop_path | 4491 | 9.7% |
| poster_path | 4491 | 9.7% |
| collection | 3746 | 8.1% |
| none | 1771 | 3.8% |
| the | 1146 | 2.5% |
| of | 230 | 0.5% |
| series | 147 | 0.3% |
| | 139 | 0.3% |
| Other values (6634) | 21083 |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 59225 | 9.3% |
| (space) | 41739 | 6.6% |
| p | 29081 | 4.6% |
| a | 25710 | 4.0% |
| o | 25040 | 3.9% |
| e | 24229 | 3.8% |
| t | 23203 | 3.7% |
| : | 18063 | 2.8% |
| n | 16731 | 2.6% |
| r | 15825 | 2.5% |
| Other values (160) | 356634 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 317110 | |
| Other Punctuation | 105770 | 16.6% |
| Uppercase Letter | 95037 | 15.0% |
| Decimal Number | 56977 | 9.0% |
| Space Separator | 41739 | 6.6% |
| Connector Punctuation | 8982 | 1.4% |
| Open Punctuation | 4826 | 0.8% |
| Close Punctuation | 4826 | 0.8% |
| Dash Punctuation | 162 | < 0.1% |
| Other Letter | 37 | < 0.1% |
| Other values (3) | 14 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| p | 29081 | 9.2% |
| a | 25710 | 8.1% |
| o | 25040 | 7.9% |
| e | 24229 | 7.6% |
| t | 23203 | 7.3% |
| n | 16731 | 5.3% |
| r | 15825 | 5.0% |
| i | 15334 | 4.8% |
| h | 14439 | 4.6% |
| d | 13705 | 4.3% |
| Other values (69) | 113813 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 7696 | 8.1% |
| N | 5094 | 5.4% |
| T | 4597 | 4.8% |
| S | 4189 | 4.4% |
| A | 3722 | 3.9% |
| M | 3699 | 3.9% |
| D | 3683 | 3.9% |
| B | 3680 | 3.9% |
| L | 3482 | 3.7% |
| G | 3461 | 3.6% |
| Other values (33) | 51734 |
Other Letter
| Value | Count | Frequency (%) |
| リ | 3 | 8.1% |
| い | 3 | 8.1% |
| 男 | 3 | 8.1% |
| は | 3 | 8.1% |
| つ | 3 | 8.1% |
| ら | 3 | 8.1% |
| よ | 3 | 8.1% |
| ズ | 3 | 8.1% |
| シ | 3 | 8.1% |
| 리 | 2 | 5.4% |
| Other values (4) | 8 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 59225 | |
| : | 18063 | 17.1% |
| , | 13552 | 12.8% |
| . | 7386 | 7.0% |
| / | 7232 | 6.8% |
| " | 214 | 0.2% |
| & | 52 | < 0.1% |
| ! | 35 | < 0.1% |
| * | 4 | < 0.1% |
| ? | 4 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 6794 | |
| 2 | 6109 | |
| 3 | 5875 | |
| 4 | 5783 | |
| 5 | 5706 | |
| 9 | 5483 | |
| 8 | 5454 | |
| 6 | 5372 | |
| 7 | 5352 | |
| 0 | 5049 |
Open Punctuation
| Value | Count | Frequency (%) |
| { | 4491 | |
| ( | 330 | 6.8% |
| [ | 5 | 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| } | 4491 | |
| ) | 330 | 6.8% |
| ] | 5 | 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 160 | |
| – | 2 | 1.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 41739 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 8982 |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 9 |
Modifier Letter
| Value | Count | Frequency (%) |
| ー | 3 |
Other Number
| Value | Count | Frequency (%) |
| ½ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 411733 | |
| Common | 223296 | |
| Cyrillic | 414 | 0.1% |
| Hiragana | 15 | < 0.1% |
| Hangul | 10 | < 0.1% |
| Katakana | 9 | < 0.1% |
| Han | 3 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| p | 29081 | 7.1% |
| a | 25710 | 6.2% |
| o | 25040 | 6.1% |
| e | 24229 | 5.9% |
| t | 23203 | 5.6% |
| n | 16731 | 4.1% |
| r | 15825 | 3.8% |
| i | 15334 | 3.7% |
| h | 14439 | 3.5% |
| d | 13705 | 3.3% |
| Other values (70) | 208436 |
Cyrillic
| Value | Count | Frequency (%) |
| л | 48 | 11.6% |
| и | 41 | 9.9% |
| о | 37 | 8.9% |
| к | 30 | 7.2% |
| е | 27 | 6.5% |
| я | 25 | 6.0% |
| а | 17 | 4.1% |
| К | 16 | 3.9% |
| ц | 16 | 3.9% |
| р | 14 | 3.4% |
| Other values (32) | 143 |
Common
| Value | Count | Frequency (%) |
| ' | 59225 | |
| (space) | 41739 | |
| : | 18063 | 8.1% |
| , | 13552 | 6.1% |
| _ | 8982 | 4.0% |
| . | 7386 | 3.3% |
| / | 7232 | 3.2% |
| 1 | 6794 | 3.0% |
| 2 | 6109 | 2.7% |
| 3 | 5875 | 2.6% |
| Other values (24) | 48339 |
Hiragana
| Value | Count | Frequency (%) |
| い | 3 | |
| は | 3 | |
| つ | 3 | |
| ら | 3 | |
| よ | 3 |
Hangul
| Value | Count | Frequency (%) |
| 리 | 2 | |
| 즈 | 2 | |
| 객 | 2 | |
| 시 | 2 | |
| 식 | 2 |
Katakana
| Value | Count | Frequency (%) |
| リ | 3 | |
| ズ | 3 | |
| シ | 3 |
Han
| Value | Count | Frequency (%) |
| 男 | 3 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 634766 | |
| Cyrillic | 414 | 0.1% |
| None | 246 | < 0.1% |
| Hiragana | 15 | < 0.1% |
| Punctuation | 14 | < 0.1% |
| Katakana | 12 | < 0.1% |
| Hangul | 10 | < 0.1% |
| CJK | 3 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| ' | 59225 | 9.3% |
| (space) | 41739 | 6.6% |
| p | 29081 | 4.6% |
| a | 25710 | 4.1% |
| o | 25040 | 3.9% |
| e | 24229 | 3.8% |
| t | 23203 | 3.7% |
| : | 18063 | 2.8% |
| n | 16731 | 2.6% |
| r | 15825 | 2.5% |
| Other values (71) | 355920 |
Cyrillic
| Value | Count | Frequency (%) |
| л | 48 | 11.6% |
| и | 41 | 9.9% |
| о | 37 | 8.9% |
| к | 30 | 7.2% |
| е | 27 | 6.5% |
| я | 25 | 6.0% |
| а | 17 | 4.1% |
| К | 16 | 3.9% |
| ц | 16 | 3.9% |
| р | 14 | 3.4% |
| Other values (32) | 143 |
None
| Value | Count | Frequency (%) |
| é | 45 | |
| ä | 40 | |
| ô | 35 | |
| ò | 28 | |
| ö | 19 | |
| ó | 14 | 5.7% |
| ı | 14 | 5.7% |
| í | 9 | 3.7% |
| á | 4 | 1.6% |
| İ | 4 | 1.6% |
| Other values (19) | 34 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 9 | |
| … | 3 | 21.4% |
| – | 2 | 14.3% |
Katakana
| Value | Count | Frequency (%) |
| リ | 3 | |
| ー | 3 | |
| ズ | 3 | |
| シ | 3 |
Hiragana
| Value | Count | Frequency (%) |
| い | 3 | |
| は | 3 | |
| つ | 3 | |
| ら | 3 | |
| よ | 3 |
CJK
| Value | Count | Frequency (%) |
| 男 | 3 |
Hangul
| Value | Count | Frequency (%) |
| 리 | 2 | |
| 즈 | 2 | |
| 객 | 2 | |
| 시 | 2 | |
| 식 | 2 |
budget
Text
| Distinct | 1226 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.6 MiB |
Length
| Max length | 32 |
|---|---|
| Median length | 1 |
| Mean length | 2.215391721 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100725 |
|---|---|
| Distinct characters | 49 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 839 |
|---|---|
| Unique (%) | 1.8% |
Sample
| 1st row | 30000000 |
|---|---|
| 2nd row | 65000000 |
| 3rd row | 0 |
| 4th row | 16000000 |
| 5th row | 0 |
| Value | Count | Frequency (%) |
| 0 | 36573 | |
| 5000000 | 286 | 0.6% |
| 10000000 | 259 | 0.6% |
| 20000000 | 243 | 0.5% |
| 2000000 | 242 | 0.5% |
| 15000000 | 226 | 0.5% |
| 3000000 | 223 | 0.5% |
| 25000000 | 206 | 0.5% |
| 1000000 | 197 | 0.4% |
| 30000000 | 190 | 0.4% |
| Other values (1216) | 6821 | 15.0% |
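The budget column is profiled as Text rather than Numeric, and the character tables further down show a small number of letters, `.` and `/` among the digits. A minimal sketch of a strict coercion that keeps clean integers and flags everything else is shown below; the function name and the decision to map unparseable cells to `None` are assumptions, and the non-numeric example value is hypothetical.

```python
def parse_budget(raw):
    """Return the budget as an int, or None if the cell is not a plain integer."""
    if not isinstance(raw, str):
        return None
    text = raw.strip()
    # Accept only ASCII digit strings, so stray text and paths are rejected.
    if text.isascii() and text.isdigit():
        return int(text)
    return None


print(parse_budget("30000000"))      # a clean value from the sample rows
print(parse_budget("/example.jpg"))  # hypothetical stray value -> None
```

The report does not say whether a budget of `0` means "unknown", so this sketch leaves zeros as they are.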
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 84525 | |
| 1 | 3222 | 3.2% |
| 5 | 3201 | 3.2% |
| 2 | 2555 | 2.5% |
| 3 | 1792 | 1.8% |
| 4 | 1325 | 1.3% |
| 6 | 1147 | 1.1% |
| 7 | 1119 | 1.1% |
| 8 | 1102 | 1.1% |
| 9 | 660 | 0.7% |
| Other values (39) | 77 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100648 | |
| Lowercase Letter | 46 | < 0.1% |
| Uppercase Letter | 25 | < 0.1% |
| Other Punctuation | 6 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| g | 5 | 10.9% |
| j | 4 | 8.7% |
| z | 4 | 8.7% |
| p | 4 | 8.7% |
| o | 3 | 6.5% |
| b | 3 | 6.5% |
| f | 3 | 6.5% |
| w | 2 | 4.3% |
| s | 2 | 4.3% |
| q | 2 | 4.3% |
| Other values (12) | 14 |
Uppercase Letter
| Value | Count | Frequency (%) |
| W | 3 | |
| G | 3 | |
| F | 2 | 8.0% |
| X | 2 | 8.0% |
| V | 2 | 8.0% |
| S | 2 | 8.0% |
| L | 2 | 8.0% |
| D | 2 | 8.0% |
| R | 1 | 4.0% |
| H | 1 | 4.0% |
| Other values (5) | 5 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 84525 | |
| 1 | 3222 | 3.2% |
| 5 | 3201 | 3.2% |
| 2 | 2555 | 2.5% |
| 3 | 1792 | 1.8% |
| 4 | 1325 | 1.3% |
| 6 | 1147 | 1.1% |
| 7 | 1119 | 1.1% |
| 8 | 1102 | 1.1% |
| 9 | 660 | 0.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3 | |
| / | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100654 | |
| Latin | 71 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| g | 5 | 7.0% |
| j | 4 | 5.6% |
| z | 4 | 5.6% |
| p | 4 | 5.6% |
| o | 3 | 4.2% |
| b | 3 | 4.2% |
| W | 3 | 4.2% |
| G | 3 | 4.2% |
| f | 3 | 4.2% |
| w | 2 | 2.8% |
| Other values (27) | 37 |
Common
| Value | Count | Frequency (%) |
| 0 | 84525 | |
| 1 | 3222 | 3.2% |
| 5 | 3201 | 3.2% |
| 2 | 2555 | 2.5% |
| 3 | 1792 | 1.8% |
| 4 | 1325 | 1.3% |
| 6 | 1147 | 1.1% |
| 7 | 1119 | 1.1% |
| 8 | 1102 | 1.1% |
| 9 | 660 | 0.7% |
| Other values (2) | 6 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100725 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 84525 | |
| 1 | 3222 | 3.2% |
| 5 | 3201 | 3.2% |
| 2 | 2555 | 2.5% |
| 3 | 1792 | 1.8% |
| 4 | 1325 | 1.3% |
| 6 | 1147 | 1.1% |
| 7 | 1119 | 1.1% |
| 8 | 1102 | 1.1% |
| 9 | 660 | 0.7% |
| Other values (39) | 77 | 0.1% |
genres
Text
| Distinct | 4069 |
|---|---|
| Distinct (%) | 8.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.2 MiB |
Length
| Max length | 264 |
|---|---|
| Median length | 225 |
| Mean length | 62.82213082 |
| Min length | 2 |
Characters and Unicode
| Total characters | 2856271 |
|---|---|
| Distinct characters | 56 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2365 |
|---|---|
| Unique (%) | 5.2% |
Sample
| 1st row | [{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}, {'id': 10751, 'name': 'Family'}] |
|---|---|
| 2nd row | [{'id': 12, 'name': 'Adventure'}, {'id': 14, 'name': 'Fantasy'}, {'id': 10751, 'name': 'Family'}] |
| 3rd row | [{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}] |
| 4th row | [{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10749, 'name': 'Romance'}] |
| 5th row | [{'id': 35, 'name': 'Comedy'}] |
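Each genres cell is a text-encoded list of `{'id', 'name'}` dictionaries, so genre frequencies have to be computed after flattening the lists. The sketch below does this for the five sample rows only; the helper name `genre_names` and the choice to treat unparseable cells as empty lists are illustrative assumptions.

```python
import ast
from collections import Counter


def genre_names(raw):
    """Extract genre names from one cell; return [] if it cannot be parsed."""
    try:
        items = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return []
    if not isinstance(items, list):
        return []
    return [
        item["name"]
        for item in items
        if isinstance(item, dict) and "name" in item
    ]


# The five sample rows shown in the table above.
samples = [
    "[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}, {'id': 10751, 'name': 'Family'}]",
    "[{'id': 12, 'name': 'Adventure'}, {'id': 14, 'name': 'Fantasy'}, {'id': 10751, 'name': 'Family'}]",
    "[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]",
    "[{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10749, 'name': 'Romance'}]",
    "[{'id': 35, 'name': 'Comedy'}]",
]

counts = Counter(name for raw in samples for name in genre_names(raw))
print(counts.most_common(3))
```

Counting parsed names avoids the double counting visible in the word table that follows, where each genre appears once as a name and once as its numeric id.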
| Value | Count | Frequency (%) |
| id | 91106 | |
| name | 91106 | |
| drama | 20265 | 5.5% |
| 18 | 20265 | 5.5% |
| 35 | 13182 | 3.6% |
| comedy | 13182 | 3.6% |
| 53 | 7624 | 2.1% |
| thriller | 7624 | 2.1% |
| romance | 6735 | 1.8% |
| 10749 | 6735 | 1.8% |
| Other values (71) | 92873 |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 546636 | |
| (space) | 325231 | 11.4% |
| : | 182212 | 6.4% |
| a | 152966 | 5.4% |
| e | 146936 | 5.1% |
| m | 144238 | 5.0% |
| , | 139188 | 4.9% |
| i | 130819 | 4.6% |
| n | 126822 | 4.4% |
| d | 107792 | 3.8% |
| Other values (46) | 853431 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1059477 | |
| Other Punctuation | 868036 | |
| Space Separator | 325231 | 11.4% |
| Decimal Number | 234672 | 8.2% |
| Close Punctuation | 136572 | 4.8% |
| Open Punctuation | 136572 | 4.8% |
| Uppercase Letter | 95711 | 3.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 152966 | |
| e | 146936 | |
| m | 144238 | |
| i | 130819 | |
| n | 126822 | |
| d | 107792 | |
| r | 69131 | |
| o | 48578 | 4.6% |
| y | 28531 | 2.7% |
| c | 28015 | 2.6% |
| Other values (12) | 75649 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 24197 | |
| C | 17492 | |
| A | 12029 | |
| F | 9756 | |
| T | 8395 | 8.8% |
| R | 6737 | 7.0% |
| H | 6072 | 6.3% |
| M | 4834 | 5.1% |
| S | 3053 | 3.2% |
| W | 2365 | 2.5% |
| Other values (6) | 781 | 0.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 45609 | |
| 8 | 39739 | |
| 5 | 24901 | |
| 3 | 23251 | |
| 7 | 22757 | |
| 0 | 21491 | |
| 9 | 18690 | |
| 2 | 17694 | 7.5% |
| 4 | 13113 | 5.6% |
| 6 | 7427 | 3.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 546636 | |
| : | 182212 | 21.0% |
| , | 139188 | 16.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| } | 91106 | |
| ] | 45466 |
Open Punctuation
| Value | Count | Frequency (%) |
| { | 91106 | |
| [ | 45466 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 325231 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1701083 | |
| Latin | 1155188 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 152966 | |
| e | 146936 | |
| m | 144238 | |
| i | 130819 | |
| n | 126822 | |
| d | 107792 | |
| r | 69131 | |
| o | 48578 | 4.2% |
| y | 28531 | 2.5% |
| c | 28015 | 2.4% |
| Other values (28) | 171360 |
Common
| Value | Count | Frequency (%) |
| ' | 546636 | |
| (space) | 325231 | |
| : | 182212 | 10.7% |
| , | 139188 | 8.2% |
| } | 91106 | 5.4% |
| { | 91106 | 5.4% |
| 1 | 45609 | 2.7% |
| ] | 45466 | 2.7% |
| [ | 45466 | 2.7% |
| 8 | 39739 | 2.3% |
| Other values (8) | 149324 | 8.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2856271 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| ' | 546636 | |
| (space) | 325231 | 11.4% |
| : | 182212 | 6.4% |
| a | 152966 | 5.4% |
| e | 146936 | 5.1% |
| m | 144238 | 5.0% |
| , | 139188 | 4.9% |
| i | 130819 | 4.6% |
| n | 126822 | 4.4% |
| d | 107792 | 3.8% |
| Other values (46) | 853431 |
homepage
Text
MISSING 
| Distinct | 7673 |
|---|---|
| Distinct (%) | 98.6% |
| Missing | 37684 |
| Missing (%) | 82.9% |
| Memory size | 1.8 MiB |
Length
| Max length | 242 |
|---|---|
| Median length | 110 |
| Mean length | 36.71279877 |
| Min length | 13 |
Characters and Unicode
| Total characters | 285699 |
|---|---|
| Distinct characters | 91 |
| Distinct categories | 12 |
| Distinct scripts | 3 |
| Distinct blocks | 4 |
Unique
| Unique | 7610 |
|---|---|
| Unique (%) | 97.8% |
Sample
| 1st row | http://toystory.disney.com/toy-story |
|---|---|
| 2nd row | http://www.mgm.com/view/movie/757/Goldeneye/ |
| 3rd row | http://www.mgm.com/title_title.do?title_star=LEAVINGL |
| 4th row | http://www.sevenmovie.com/ |
| 5th row | http://www.mgm.com/#/our-titles/2083/The-Usual-Suspects |
| Value | Count | Frequency (%) |
| http://www.georgecarlin.com | 12 | 0.2% |
| iso_3166_1 | 7 | 0.1% |
| name | 7 | 0.1% |
| http://www.wernerherzog.com/films-by.html | 7 | 0.1% |
| http://www.kungfupanda.com | 6 | 0.1% |
| http://breakblade.jp | 6 | 0.1% |
| http://www.missionimpossible.com | 5 | 0.1% |
| http://www.transformersmovie.com | 5 | 0.1% |
| http://www.thehungergames.movie | 4 | 0.1% |
| http://www.crownintlpictures.com/ostitles.html | 4 | 0.1% |
| Other values (7658) | 7753 |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 25849 | 9.0% |
| / | 25820 | 9.0% |
| w | 19516 | 6.8% |
| o | 18783 | 6.6% |
| e | 18709 | 6.5% |
| . | 15387 | 5.4% |
| m | 15101 | 5.3% |
| h | 13863 | 4.9% |
| i | 13654 | 4.8% |
| c | 11414 | 4.0% |
| Other values (81) | 107603 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 225336 | |
| Other Punctuation | 49566 | 17.3% |
| Decimal Number | 4726 | 1.7% |
| Dash Punctuation | 3507 | 1.2% |
| Uppercase Letter | 1721 | 0.6% |
| Connector Punctuation | 471 | 0.2% |
| Math Symbol | 287 | 0.1% |
| Space Separator | 34 | < 0.1% |
| Open Punctuation | 24 | < 0.1% |
| Close Punctuation | 24 | < 0.1% |
| Other values (2) | 3 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 25849 | 11.5% |
| w | 19516 | 8.7% |
| o | 18783 | 8.3% |
| e | 18709 | 8.3% |
| m | 15101 | 6.7% |
| h | 13863 | 6.2% |
| i | 13654 | 6.1% |
| c | 11414 | 5.1% |
| p | 11166 | 5.0% |
| a | 11155 | 5.0% |
| Other values (18) | 66126 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 145 | 8.4% |
| T | 138 | 8.0% |
| S | 122 | 7.1% |
| A | 106 | 6.2% |
| F | 101 | 5.9% |
| B | 96 | 5.6% |
| E | 91 | 5.3% |
| D | 90 | 5.2% |
| I | 90 | 5.2% |
| C | 88 | 5.1% |
| Other values (16) | 654 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 25820 | |
| . | 15387 | |
| : | 7806 | 15.7% |
| ? | 189 | 0.4% |
| % | 105 | 0.2% |
| # | 85 | 0.2% |
| & | 79 | 0.2% |
| ' | 60 | 0.1% |
| , | 17 | < 0.1% |
| ! | 14 | < 0.1% |
| Other values (2) | 4 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 853 | |
| 2 | 728 | |
| 1 | 727 | |
| 3 | 478 | |
| 9 | 349 | |
| 6 | 341 | 7.2% |
| 4 | 331 | 7.0% |
| 5 | 314 | 6.6% |
| 8 | 311 | 6.6% |
| 7 | 294 | 6.2% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 271 | |
| + | 14 | 4.9% |
| ~ | 2 | 0.7% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 13 | |
| { | 8 | |
| [ | 3 | 12.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 13 | |
| } | 8 | |
| ] | 3 | 12.5% |
Other Letter
| Value | Count | Frequency (%) |
| 녀 | 1 | |
| 협 | 1 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 3507 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 471 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 34 |
Format
| Value | Count | Frequency (%) |
| | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 227057 | |
| Common | 58640 | 20.5% |
| Hangul | 2 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 25849 | 11.4% |
| w | 19516 | 8.6% |
| o | 18783 | 8.3% |
| e | 18709 | 8.2% |
| m | 15101 | 6.7% |
| h | 13863 | 6.1% |
| i | 13654 | 6.0% |
| c | 11414 | 5.0% |
| p | 11166 | 4.9% |
| a | 11155 | 4.9% |
| Other values (44) | 67847 |
Common
| Value | Count | Frequency (%) |
| / | 25820 | |
| . | 15387 | |
| : | 7806 | 13.3% |
| - | 3507 | 6.0% |
| 0 | 853 | 1.5% |
| 2 | 728 | 1.2% |
| 1 | 727 | 1.2% |
| 3 | 478 | 0.8% |
| _ | 471 | 0.8% |
| 9 | 349 | 0.6% |
| Other values (25) | 2514 | 4.3% |
Hangul
| Value | Count | Frequency (%) |
| 녀 | 1 | |
| 협 | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 285693 | |
| Hangul | 2 | < 0.1% |
| None | 2 | < 0.1% |
| Punctuation | 2 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 25849 | 9.0% |
| / | 25820 | 9.0% |
| w | 19516 | 6.8% |
| o | 18783 | 6.6% |
| e | 18709 | 6.5% |
| . | 15387 | 5.4% |
| m | 15101 | 5.3% |
| h | 13863 | 4.9% |
| i | 13654 | 4.8% |
| c | 11414 | 4.0% |
| Other values (75) | 107597 |
Hangul
| Value | Count | Frequency (%) |
| 녀 | 1 | |
| 협 | 1 |
None
| Value | Count | Frequency (%) |
| ñ | 1 | |
| ä | 1 |
Punctuation
| Value | Count | Frequency (%) |
| | 1 | |
| … | 1 |
id
Text
| Distinct | 45436 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 5 |
| Mean length | 5.251484626 |
| Min length | 1 |
Characters and Unicode
| Total characters | 238764 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 45407 |
|---|---|
| Unique (%) | 99.9% |
Sample
| 1st row | 862 |
|---|---|
| 2nd row | 8844 |
| 3rd row | 15602 |
| 4th row | 31357 |
| 5th row | 11862 |
| Value | Count | Frequency (%) |
| 141971 | 3 | < 0.1% |
| 159849 | 2 | < 0.1% |
| 168538 | 2 | < 0.1% |
| 298721 | 2 | < 0.1% |
| 265189 | 2 | < 0.1% |
| 5511 | 2 | < 0.1% |
| 97995 | 2 | < 0.1% |
| 99080 | 2 | < 0.1% |
| 23305 | 2 | < 0.1% |
| 119916 | 2 | < 0.1% |
| Other values (45426) | 45445 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 32923 | |
| 2 | 28625 | |
| 3 | 26732 | |
| 4 | 24747 | |
| 5 | 21996 | |
| 6 | 21184 | |
| 7 | 20949 | |
| 8 | 20909 | |
| 9 | 20485 | |
| 0 | 20208 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 238758 | |
| Dash Punctuation | 6 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 32923 | |
| 2 | 28625 | |
| 3 | 26732 | |
| 4 | 24747 | |
| 5 | 21996 | |
| 6 | 21184 | |
| 7 | 20949 | |
| 8 | 20909 | |
| 9 | 20485 | |
| 0 | 20208 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 6 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 238764 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 32923 | |
| 2 | 28625 | |
| 3 | 26732 | |
| 4 | 24747 | |
| 5 | 21996 | |
| 6 | 21184 | |
| 7 | 20949 | |
| 8 | 20909 | |
| 9 | 20485 | |
| 0 | 20208 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 238764 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 32923 | |
| 2 | 28625 | |
| 3 | 26732 | |
| 4 | 24747 | |
| 5 | 21996 | |
| 6 | 21184 | |
| 7 | 20949 | |
| 8 | 20909 | |
| 9 | 20485 | |
| 0 | 20208 |
imdb_id
Text
| Distinct | 45417 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 17 |
| Missing (%) | < 0.1% |
| Memory size | 2.9 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.999471936 |
| Min length | 1 |
Characters and Unicode
| Total characters | 409017 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 45387 |
|---|---|
| Unique (%) | 99.9% |
Sample
| 1st row | tt0114709 |
|---|---|
| 2nd row | tt0113497 |
| 3rd row | tt0113228 |
| 4th row | tt0114885 |
| 5th row | tt0113041 |
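The sample rows share one shape, `tt` followed by seven digits, which matches the reported maximum length of 9, while the minimum length of 1 indicates that some cells do not follow it. A minimal validation sketch is given below; the pattern is an assumption inferred from these samples rather than an official identifier specification, and the function name is hypothetical.

```python
import re

# Pattern inferred from the sample rows: "tt" plus exactly seven digits.
IMDB_ID = re.compile(r"tt\d{7}")


def is_valid_imdb_id(raw):
    """Return True if the cell looks like the sampled identifier format."""
    return isinstance(raw, str) and IMDB_ID.fullmatch(raw) is not None


print(is_valid_imdb_id("tt0114709"))  # True
print(is_valid_imdb_id("0"))          # False
```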
| Value | Count | Frequency (%) |
| tt1180333 | 3 | < 0.1% |
| 0 | 3 | < 0.1% |
| tt0046468 | 2 | < 0.1% |
| tt1327820 | 2 | < 0.1% |
| tt2818654 | 2 | < 0.1% |
| tt0111613 | 2 | < 0.1% |
| tt1821641 | 2 | < 0.1% |
| tt0127834 | 2 | < 0.1% |
| tt0295682 | 2 | < 0.1% |
| tt0080000 | 2 | < 0.1% |
| Other values (45407) | 45427 |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 90892 | |
| 0 | 69913 | |
| 1 | 37232 | |
| 2 | 31234 | 7.6% |
| 4 | 28498 | 7.0% |
| 3 | 28135 | 6.9% |
| 8 | 25445 | 6.2% |
| 6 | 25442 | 6.2% |
| 5 | 24253 | 5.9% |
| 7 | 24221 | 5.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 318125 | |
| Lowercase Letter | 90892 | 22.2% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 69913 | |
| 1 | 37232 | |
| 2 | 31234 | |
| 4 | 28498 | |
| 3 | 28135 | |
| 8 | 25445 | 8.0% |
| 6 | 25442 | 8.0% |
| 5 | 24253 | 7.6% |
| 7 | 24221 | 7.6% |
| 9 | 23752 | 7.5% |
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 90892 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 318125 | |
| Latin | 90892 | 22.2% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 69913 | |
| 1 | 37232 | |
| 2 | 31234 | |
| 4 | 28498 | |
| 3 | 28135 | |
| 8 | 25445 | 8.0% |
| 6 | 25442 | 8.0% |
| 5 | 24253 | 7.6% |
| 7 | 24221 | 7.6% |
| 9 | 23752 | 7.5% |
Latin
| Value | Count | Frequency (%) |
| t | 90892 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 409017 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 90892 | |
| 0 | 69913 | |
| 1 | 37232 | |
| 2 | 31234 | 7.6% |
| 4 | 28498 | 7.0% |
| 3 | 28135 | 6.9% |
| 8 | 25445 | 6.2% |
| 6 | 25442 | 6.2% |
| 5 | 24253 | 5.9% |
| 7 | 24221 | 5.9% |
original_language
Text
| Distinct | 92 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 11 |
| Missing (%) | < 0.1% |
| Memory size | 2.6 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 2 |
| Mean length | 2.000153998 |
| Min length | 2 |
Characters and Unicode
| Total characters | 90917 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 20 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | en |
|---|---|
| 2nd row | en |
| 3rd row | en |
| 4th row | en |
| 5th row | en |
| Value | Count | Frequency (%) |
| en | 32269 | |
| fr | 2438 | 5.4% |
| it | 1529 | 3.4% |
| ja | 1350 | 3.0% |
| de | 1080 | 2.4% |
| es | 994 | 2.2% |
| ru | 826 | 1.8% |
| hi | 508 | 1.1% |
| ko | 444 | 1.0% |
| zh | 409 | 0.9% |
| Other values (82) | 3608 | 7.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 34598 | |
| n | 32978 | |
| r | 3636 | 4.0% |
| f | 2839 | 3.1% |
| i | 2391 | 2.6% |
| t | 2252 | 2.5% |
| a | 1841 | 2.0% |
| s | 1654 | 1.8% |
| j | 1351 | 1.5% |
| d | 1325 | 1.5% |
| Other values (23) | 6052 | 6.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 90904 | |
| Decimal Number | 10 | < 0.1% |
| Other Punctuation | 3 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 34598 | |
| n | 32978 | |
| r | 3636 | 4.0% |
| f | 2839 | 3.1% |
| i | 2391 | 2.6% |
| t | 2252 | 2.5% |
| a | 1841 | 2.0% |
| s | 1654 | 1.8% |
| j | 1351 | 1.5% |
| d | 1325 | 1.5% |
| Other values (16) | 6039 | 6.6% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4 | |
| 8 | 2 | |
| 2 | 1 | 10.0% |
| 6 | 1 | 10.0% |
| 1 | 1 | 10.0% |
| 4 | 1 | 10.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 90904 | |
| Common | 13 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 34598 | |
| n | 32978 | |
| r | 3636 | 4.0% |
| f | 2839 | 3.1% |
| i | 2391 | 2.6% |
| t | 2252 | 2.5% |
| a | 1841 | 2.0% |
| s | 1654 | 1.8% |
| j | 1351 | 1.5% |
| d | 1325 | 1.5% |
| Other values (16) | 6039 | 6.6% |
Common
| Value | Count | Frequency (%) |
| 0 | 4 | |
| . | 3 | |
| 8 | 2 | |
| 2 | 1 | 7.7% |
| 6 | 1 | 7.7% |
| 1 | 1 | 7.7% |
| 4 | 1 | 7.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 90917 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 34598 | |
| n | 32978 | |
| r | 3636 | 4.0% |
| f | 2839 | 3.1% |
| i | 2391 | 2.6% |
| t | 2252 | 2.5% |
| a | 1841 | 2.0% |
| s | 1654 | 1.8% |
| j | 1351 | 1.5% |
| d | 1325 | 1.5% |
| Other values (23) | 6052 | 6.7% |
original_title
Text
| Distinct | 43373 |
|---|---|
| Distinct (%) | 95.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.4 MiB |
Length
| Max length | 109 |
|---|---|
| Median length | 84 |
| Mean length | 16.32349448 |
| Min length | 1 |
Characters and Unicode
| Total characters | 742164 |
|---|---|
| Distinct characters | 2946 |
| Distinct categories | 21 |
| Distinct scripts | 21 |
| Distinct blocks | 29 |
Unique
| Unique | 41712 |
|---|---|
| Unique (%) | 91.7% |
Sample
| 1st row | Toy Story |
|---|---|
| 2nd row | Jumanji |
| 3rd row | Grumpier Old Men |
| 4th row | Waiting to Exhale |
| 5th row | Father of the Bride Part II |
| Value | Count | Frequency (%) |
| the | 10261 | 7.8% |
| of | 3309 | 2.5% |
| a | 1674 | 1.3% |
| in | 1275 | 1.0% |
| and | 1072 | 0.8% |
| la | 1007 | 0.8% |
| | 863 | 0.7% |
| to | 806 | 0.6% |
| de | 702 | 0.5% |
| man | 509 | 0.4% |
| Other values (35324) | 110301 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 86293 | 11.6% |
| e | 70665 | 9.5% |
| a | 49100 | 6.6% |
| o | 42066 | 5.7% |
| i | 39494 | 5.3% |
| n | 39149 | 5.3% |
| r | 37728 | 5.1% |
| t | 33530 | 4.5% |
| s | 28615 | 3.9% |
| l | 25557 | 3.4% |
| Other values (2936) | 289967 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 521249 | |
| Uppercase Letter | 102496 | 13.8% |
| Space Separator | 86338 | 11.6% |
| Other Letter | 14792 | 2.0% |
| Other Punctuation | 10434 | 1.4% |
| Decimal Number | 3862 | 0.5% |
| Dash Punctuation | 1207 | 0.2% |
| Nonspacing Mark | 579 | 0.1% |
| Spacing Mark | 480 | 0.1% |
| Modifier Letter | 249 | < 0.1% |
| Other values (11) | 478 | 0.1% |
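The category names in the table above are Unicode general categories. As a rough illustration of how such counts can be reproduced, the sketch below tallies categories with Python's standard `unicodedata` module. It is not necessarily how the report computes them; the function name `category_counts` and the two-title input are assumptions made for the example. The standard library returns two-letter codes, for instance `Ll` for Lowercase Letter, `Lu` for Uppercase Letter and `Zs` for Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(texts):
    """Count Unicode general categories over every character in `texts`."""
    counts = Counter()
    for text in texts:
        # unicodedata.category returns a two-letter code such as 'Ll' or 'Zs'
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


# Two titles taken from the Sample table, used only as a small input
titles = ["Toy Story", "Jumanji"]
counts = category_counts(titles)
print(counts.most_common())
```

Running the same function over the full column should give totals comparable to the table, although small differences are possible if the report uses a different Unicode database version.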
Most frequent character per category
Other Letter
| Value | Count | Frequency (%) |
| の | 341 | 2.3% |
| ン | 183 | 1.2% |
| ラ | 141 | 1.0% |
| ا | 117 | 0.8% |
| ス | 116 | 0.8% |
| イ | 112 | 0.8% |
| ル | 112 | 0.8% |
| い | 84 | 0.6% |
| ト | 84 | 0.6% |
| ッ | 76 | 0.5% |
| Other values (2400) | 13426 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 70665 | |
| a | 49100 | 9.4% |
| o | 42066 | 8.1% |
| i | 39494 | 7.6% |
| n | 39149 | 7.5% |
| r | 37728 | 7.2% |
| t | 33530 | 6.4% |
| s | 28615 | 5.5% |
| l | 25557 | 4.9% |
| h | 22886 | 4.4% |
| Other values (200) | 132459 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 12191 | 11.9% |
| S | 8839 | 8.6% |
| M | 6898 | 6.7% |
| B | 6470 | 6.3% |
| L | 6176 | 6.0% |
| C | 6079 | 5.9% |
| A | 5849 | 5.7% |
| D | 5693 | 5.6% |
| H | 4443 | 4.3% |
| P | 4434 | 4.3% |
| Other values (121) | 35424 |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ் | 90 | |
| ് | 58 | 10.0% |
| ् | 42 | 7.3% |
| े | 30 | 5.2% |
| ั | 28 | 4.8% |
| ్ | 25 | 4.3% |
| ่ | 25 | 4.3% |
| ้ | 21 | 3.6% |
| ี | 21 | 3.6% |
| ্ | 19 | 3.3% |
| Other values (34) | 220 |
Spacing Mark
| Value | Count | Frequency (%) |
| ा | 87 | |
| ா | 37 | 7.7% |
| ी | 37 | 7.7% |
| ு | 36 | 7.5% |
| া | 33 | 6.9% |
| ி | 33 | 6.9% |
| ि | 20 | 4.2% |
| ം | 19 | 4.0% |
| ो | 17 | 3.5% |
| ാ | 16 | 3.3% |
| Other values (25) | 145 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 3331 | |
| ' | 2614 | |
| . | 1690 | |
| , | 1133 | 10.9% |
| ! | 687 | 6.6% |
| & | 403 | 3.9% |
| ? | 253 | 2.4% |
| / | 84 | 0.8% |
| ・ | 75 | 0.7% |
| * | 20 | 0.2% |
| Other values (24) | 144 | 1.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 841 | |
| 1 | 688 | |
| 0 | 589 | |
| 3 | 480 | |
| 9 | 238 | 6.2% |
| 4 | 232 | 6.0% |
| 5 | 219 | 5.7% |
| 7 | 210 | 5.4% |
| 8 | 163 | 4.2% |
| 6 | 163 | 4.2% |
| Other values (14) | 39 | 1.0% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 19 | |
| ~ | 16 | |
| × | 7 | 12.7% |
| = | 3 | 5.5% |
| ~ | 3 | 5.5% |
| < | 2 | 3.6% |
| > | 2 | 3.6% |
| + | 1 | 1.8% |
| ∞ | 1 | 1.8% |
| → | 1 | 1.8% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1165 | |
| – | 32 | 2.7% |
| 〜 | 4 | 0.3% |
| — | 3 | 0.2% |
| ― | 2 | 0.2% |
| - | 1 | 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 98 | |
| ] | 13 | 9.4% |
| 」 | 13 | 9.4% |
| ) | 8 | 5.8% |
| 〉 | 3 | 2.2% |
| } | 3 | 2.2% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 96 | |
| 「 | 13 | 9.6% |
| [ | 13 | 9.6% |
| ( | 8 | 5.9% |
| 〈 | 3 | 2.2% |
| { | 3 | 2.2% |
Other Symbol
| Value | Count | Frequency (%) |
| ☆ | 10 | |
| ° | 7 | |
| ™ | 1 | 4.8% |
| ♡ | 1 | 4.8% |
| № | 1 | 4.8% |
| ★ | 1 | 4.8% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 10 | |
| ² | 2 | 13.3% |
| ³ | 2 | 13.3% |
| ⅓ | 1 | 6.7% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 32 | |
| » | 5 | 13.2% |
| ” | 1 | 2.6% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 18 | |
| ¢ | 2 | 9.5% |
| £ | 1 | 4.8% |
Format
| Value | Count | Frequency (%) |
| | 15 | |
| | 14 | |
| | 5 | 14.7% |
Initial Punctuation
| Value | Count | Frequency (%) |
| « | 5 | |
| ‘ | 1 | 14.3% |
| “ | 1 | 14.3% |
Letter Number
| Value | Count | Frequency (%) |
| Ⅱ | 2 | |
| Ⅲ | 1 | |
| Ⅰ | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 86293 | |
| 45 | 0.1% |
Modifier Letter
| Value | Count | Frequency (%) |
| ー | 245 | |
| 々 | 4 | 1.6% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 9 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 612843 | |
| Common | 102531 | 13.8% |
| Cyrillic | 9023 | 1.2% |
| Han | 5713 | 0.8% |
| Katakana | 2543 | 0.3% |
| Hangul | 2012 | 0.3% |
| Greek | 1747 | 0.2% |
| Hiragana | 1593 | 0.2% |
| Arabic | 909 | 0.1% |
| Devanagari | 831 | 0.1% |
| Other values (11) | 2419 | 0.3% |
Most frequent character per script
Han
| Value | Count | Frequency (%) |
| 人 | 65 | 1.1% |
| 女 | 64 | 1.1% |
| 大 | 57 | 1.0% |
| 場 | 56 | 1.0% |
| 劇 | 53 | 0.9% |
| 版 | 51 | 0.9% |
| 天 | 48 | 0.8% |
| 戦 | 47 | 0.8% |
| 子 | 42 | 0.7% |
| 一 | 39 | 0.7% |
| Other values (1491) | 5191 |
Hangul
| Value | Count | Frequency (%) |
| 의 | 64 | 3.2% |
| 이 | 52 | 2.6% |
| 사 | 40 | 2.0% |
| 아 | 30 | 1.5% |
| 다 | 29 | 1.4% |
| 자 | 29 | 1.4% |
| 리 | 28 | 1.4% |
| 기 | 26 | 1.3% |
| 시 | 26 | 1.3% |
| 스 | 23 | 1.1% |
| Other values (471) | 1665 |
Latin
| Value | Count | Frequency (%) |
| e | 70665 | 11.5% |
| a | 49100 | 8.0% |
| o | 42066 | 6.9% |
| i | 39494 | 6.4% |
| n | 39149 | 6.4% |
| r | 37728 | 6.2% |
| t | 33530 | 5.5% |
| s | 28615 | 4.7% |
| l | 25557 | 4.2% |
| h | 22886 | 3.7% |
| Other values (176) | 224053 |
Common
| Value | Count | Frequency (%) |
| (space) | 86293 | |
| : | 3331 | 3.2% |
| ' | 2614 | 2.5% |
| . | 1690 | 1.6% |
| - | 1165 | 1.1% |
| , | 1133 | 1.1% |
| 2 | 841 | 0.8% |
| 1 | 688 | 0.7% |
| ! | 687 | 0.7% |
| 0 | 589 | 0.6% |
| Other values (94) | 3500 | 3.4% |
Katakana
| Value | Count | Frequency (%) |
| ン | 183 | 7.2% |
| ラ | 141 | 5.5% |
| ス | 116 | 4.6% |
| イ | 112 | 4.4% |
| ル | 112 | 4.4% |
| ト | 84 | 3.3% |
| ッ | 76 | 3.0% |
| ド | 75 | 2.9% |
| リ | 72 | 2.8% |
| ア | 71 | 2.8% |
| Other values (70) | 1501 |
Hiragana
| Value | Count | Frequency (%) |
| の | 341 | |
| い | 84 | 5.3% |
| な | 54 | 3.4% |
| と | 50 | 3.1% |
| た | 50 | 3.1% |
| る | 49 | 3.1% |
| り | 42 | 2.6% |
| き | 39 | 2.4% |
| し | 39 | 2.4% |
| ん | 36 | 2.3% |
| Other values (65) | 809 |
Cyrillic
| Value | Count | Frequency (%) |
| о | 837 | 9.3% |
| а | 771 | 8.5% |
| е | 700 | 7.8% |
| и | 656 | 7.3% |
| н | 573 | 6.4% |
| р | 505 | 5.6% |
| л | 393 | 4.4% |
| т | 372 | 4.1% |
| к | 335 | 3.7% |
| с | 323 | 3.6% |
| Other values (58) | 3558 |
Greek
| Value | Count | Frequency (%) |
| α | 172 | 9.8% |
| ο | 132 | 7.6% |
| ι | 110 | 6.3% |
| τ | 103 | 5.9% |
| ρ | 85 | 4.9% |
| ν | 71 | 4.1% |
| λ | 69 | 3.9% |
| ς | 68 | 3.9% |
| ε | 63 | 3.6% |
| η | 62 | 3.5% |
| Other values (49) | 812 |
Devanagari
| Value | Count | Frequency (%) |
| ा | 87 | 10.5% |
| र | 61 | 7.3% |
| न | 44 | 5.3% |
| ् | 42 | 5.1% |
| ल | 38 | 4.6% |
| क | 38 | 4.6% |
| ी | 37 | 4.5% |
| म | 33 | 4.0% |
| स | 32 | 3.9% |
| े | 30 | 3.6% |
| Other values (49) | 389 |
Thai
| Value | Count | Frequency (%) |
| ร | 48 | 6.7% |
| น | 46 | 6.4% |
| า | 46 | 6.4% |
| ก | 44 | 6.1% |
| อ | 31 | 4.3% |
| ั | 28 | 3.9% |
| ย | 27 | 3.7% |
| เ | 27 | 3.7% |
| ่ | 25 | 3.5% |
| ้ | 21 | 2.9% |
| Other values (46) | 378 |
Malayalam
| Value | Count | Frequency (%) |
| ് | 58 | 17.5% |
| ം | 19 | 5.7% |
| ര | 18 | 5.4% |
| പ | 17 | 5.1% |
| ാ | 16 | 4.8% |
| ി | 14 | 4.2% |
| മ | 12 | 3.6% |
| ന | 12 | 3.6% |
| റ | 11 | 3.3% |
| സ | 10 | 3.0% |
| Other values (36) | 145 |
Arabic
| Value | Count | Frequency (%) |
| ا | 117 | 12.9% |
| ر | 71 | 7.8% |
| ی | 68 | 7.5% |
| ن | 66 | 7.3% |
| د | 56 | 6.2% |
| و | 53 | 5.8% |
| ل | 51 | 5.6% |
| ه | 42 | 4.6% |
| ب | 39 | 4.3% |
| م | 35 | 3.9% |
| Other values (33) | 311 |
Bengali
| Value | Count | Frequency (%) |
| া | 33 | 12.5% |
| র | 25 | 9.5% |
| ্ | 19 | 7.2% |
| ন | 15 | 5.7% |
| ক | 14 | 5.3% |
| ত | 13 | 4.9% |
| ে | 12 | 4.6% |
| ি | 11 | 4.2% |
| ু | 10 | 3.8% |
| প | 9 | 3.4% |
| Other values (32) | 102 |
Tamil
| Value | Count | Frequency (%) |
| ் | 90 | |
| த | 42 | 7.5% |
| ா | 37 | 6.6% |
| ு | 36 | 6.4% |
| ி | 33 | 5.9% |
| ம | 30 | 5.4% |
| வ | 25 | 4.5% |
| க | 25 | 4.5% |
| ன | 23 | 4.1% |
| ட | 23 | 4.1% |
| Other values (31) | 196 |
Telugu
| Value | Count | Frequency (%) |
| ్ | 25 | 11.5% |
| ర | 16 | 7.3% |
| ా | 16 | 7.3% |
| ి | 13 | 6.0% |
| ు | 13 | 6.0% |
| ం | 12 | 5.5% |
| క | 11 | 5.0% |
| న | 11 | 5.0% |
| డ | 9 | 4.1% |
| ే | 9 | 4.1% |
| Other values (23) | 83 |
Georgian
| Value | Count | Frequency (%) |
| ი | 25 | |
| ა | 20 | |
| რ | 9 | 6.9% |
| ს | 8 | 6.2% |
| ნ | 7 | 5.4% |
| მ | 7 | 5.4% |
| ო | 7 | 5.4% |
| ლ | 6 | 4.6% |
| ბ | 6 | 4.6% |
| ე | 6 | 4.6% |
| Other values (15) | 29 |
Hebrew
| Value | Count | Frequency (%) |
| י | 24 | |
| ו | 22 | |
| א | 11 | 7.3% |
| ל | 9 | 6.0% |
| ר | 9 | 6.0% |
| ב | 9 | 6.0% |
| ת | 9 | 6.0% |
| ש | 7 | 4.6% |
| מ | 7 | 4.6% |
| ח | 6 | 4.0% |
| Other values (14) | 38 |
Armenian
| Value | Count | Frequency (%) |
| Տ | 1 | |
| վ | 1 | |
| ի | 1 | |
| թ | 1 | |
| ե | 1 | |
| ր | 1 | |
| ա | 1 |
Lao
| Value | Count | Frequency (%) |
| ນ | 1 | |
| ທ | 1 | |
| ະ | 1 | |
| ລ | 1 | |
| ີ | 1 | |
| ັ | 1 | |
| ຈ | 1 |
Kannada
| Value | Count | Frequency (%) |
| ಾ | 1 | |
| ಯ | 1 | |
| ಿ | 1 | |
| ಸ | 1 | |
| ೂ | 1 | |
| ಲ | 1 |
Inherited
| Value | Count | Frequency (%) |
| | 14 | |
| ́ | 5 | 20.8% |
| | 5 | 20.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 710716 | |
| Cyrillic | 9023 | 1.2% |
| None | 5949 | 0.8% |
| CJK | 5706 | 0.8% |
| Katakana | 2863 | 0.4% |
| Hangul | 2012 | 0.3% |
| Hiragana | 1593 | 0.2% |
| Arabic | 913 | 0.1% |
| Devanagari | 831 | 0.1% |
| Thai | 721 | 0.1% |
| Other values (19) | 1837 | 0.2% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 86293 | 12.1% |
| e | 70665 | 9.9% |
| a | 49100 | 6.9% |
| o | 42066 | 5.9% |
| i | 39494 | 5.6% |
| n | 39149 | 5.5% |
| r | 37728 | 5.3% |
| t | 33530 | 4.7% |
| s | 28615 | 4.0% |
| l | 25557 | 3.6% |
| Other values (81) | 258519 |
Cyrillic
| Value | Count | Frequency (%) |
| о | 837 | 9.3% |
| а | 771 | 8.5% |
| е | 700 | 7.8% |
| и | 656 | 7.3% |
| н | 573 | 6.4% |
| р | 505 | 5.6% |
| л | 393 | 4.4% |
| т | 372 | 4.1% |
| к | 335 | 3.7% |
| с | 323 | 3.6% |
| Other values (58) | 3558 |
None
| Value | Count | Frequency (%) |
| é | 719 | 12.1% |
| ä | 443 | 7.4% |
| è | 232 | 3.9% |
| ö | 200 | 3.4% |
| á | 181 | 3.0% |
| α | 172 | 2.9% |
| í | 165 | 2.8% |
| ó | 161 | 2.7% |
| à | 146 | 2.5% |
| ü | 143 | 2.4% |
| Other values (211) | 3387 |
Hiragana
| Value | Count | Frequency (%) |
| の | 341 | |
| い | 84 | 5.3% |
| な | 54 | 3.4% |
| と | 50 | 3.1% |
| た | 50 | 3.1% |
| る | 49 | 3.1% |
| り | 42 | 2.6% |
| き | 39 | 2.4% |
| し | 39 | 2.4% |
| ん | 36 | 2.3% |
| Other values (65) | 809 |
Katakana
| Value | Count | Frequency (%) |
| ー | 245 | 8.6% |
| ン | 183 | 6.4% |
| ラ | 141 | 4.9% |
| ス | 116 | 4.1% |
| イ | 112 | 3.9% |
| ル | 112 | 3.9% |
| ト | 84 | 2.9% |
| ッ | 76 | 2.7% |
| ド | 75 | 2.6% |
| ・ | 75 | 2.6% |
| Other values (72) | 1644 |
Arabic
| Value | Count | Frequency (%) |
| ا | 117 | 12.8% |
| ر | 71 | 7.8% |
| ی | 68 | 7.4% |
| ن | 66 | 7.2% |
| د | 56 | 6.1% |
| و | 53 | 5.8% |
| ل | 51 | 5.6% |
| ه | 42 | 4.6% |
| ب | 39 | 4.3% |
| م | 35 | 3.8% |
| Other values (35) | 315 |
Tamil
| Value | Count | Frequency (%) |
| ் | 90 | |
| த | 42 | 7.5% |
| ா | 37 | 6.6% |
| ு | 36 | 6.4% |
| ி | 33 | 5.9% |
| ம | 30 | 5.4% |
| வ | 25 | 4.5% |
| க | 25 | 4.5% |
| ன | 23 | 4.1% |
| ட | 23 | 4.1% |
| Other values (31) | 196 |
Devanagari
| Value | Count | Frequency (%) |
| ा | 87 | 10.5% |
| र | 61 | 7.3% |
| न | 44 | 5.3% |
| ् | 42 | 5.1% |
| ल | 38 | 4.6% |
| क | 38 | 4.6% |
| ी | 37 | 4.5% |
| म | 33 | 4.0% |
| स | 32 | 3.9% |
| े | 30 | 3.6% |
| Other values (49) | 389 |
CJK
| Value | Count | Frequency (%) |
| 人 | 65 | 1.1% |
| 女 | 64 | 1.1% |
| 大 | 57 | 1.0% |
| 場 | 56 | 1.0% |
| 劇 | 53 | 0.9% |
| 版 | 51 | 0.9% |
| 天 | 48 | 0.8% |
| 戦 | 47 | 0.8% |
| 子 | 42 | 0.7% |
| 一 | 39 | 0.7% |
| Other values (1487) | 5184 |
Hangul
| Value | Count | Frequency (%) |
| 의 | 64 | 3.2% |
| 이 | 52 | 2.6% |
| 사 | 40 | 2.0% |
| 아 | 30 | 1.5% |
| 다 | 29 | 1.4% |
| 자 | 29 | 1.4% |
| 리 | 28 | 1.4% |
| 기 | 26 | 1.3% |
| 시 | 26 | 1.3% |
| 스 | 23 | 1.1% |
| Other values (471) | 1665 |
Malayalam
| Value | Count | Frequency (%) |
| ് | 58 | 17.5% |
| ം | 19 | 5.7% |
| ര | 18 | 5.4% |
| പ | 17 | 5.1% |
| ാ | 16 | 4.8% |
| ി | 14 | 4.2% |
| മ | 12 | 3.6% |
| ന | 12 | 3.6% |
| റ | 11 | 3.3% |
| സ | 10 | 3.0% |
| Other values (36) | 145 |
Thai
| Value | Count | Frequency (%) |
| ร | 48 | 6.7% |
| น | 46 | 6.4% |
| า | 46 | 6.4% |
| ก | 44 | 6.1% |
| อ | 31 | 4.3% |
| ั | 28 | 3.9% |
| ย | 27 | 3.7% |
| เ | 27 | 3.7% |
| ่ | 25 | 3.5% |
| ้ | 21 | 2.9% |
| Other values (46) | 378 |
Bengali
| Value | Count | Frequency (%) |
| া | 33 | 12.5% |
| র | 25 | 9.5% |
| ্ | 19 | 7.2% |
| ন | 15 | 5.7% |
| ক | 14 | 5.3% |
| ত | 13 | 4.9% |
| ে | 12 | 4.6% |
| ি | 11 | 4.2% |
| ু | 10 | 3.8% |
| প | 9 | 3.4% |
| Other values (32) | 102 |
Punctuation
| Value | Count | Frequency (%) |
| – | 32 | |
| ’ | 32 | |
| | 15 | |
| | 14 | |
| … | 11 | 9.2% |
| | 5 | 4.2% |
| — | 3 | 2.5% |
| ‧ | 2 | 1.7% |
| ― | 2 | 1.7% |
| • | 1 | 0.8% |
| Other values (3) | 3 | 2.5% |
Telugu
| Value | Count | Frequency (%) |
| ్ | 25 | 11.5% |
| ర | 16 | 7.3% |
| ా | 16 | 7.3% |
| ి | 13 | 6.0% |
| ు | 13 | 6.0% |
| ం | 12 | 5.5% |
| క | 11 | 5.0% |
| న | 11 | 5.0% |
| డ | 9 | 4.1% |
| ే | 9 | 4.1% |
| Other values (23) | 83 |
Georgian
| Value | Count | Frequency (%) |
| ი | 25 | |
| ა | 20 | |
| რ | 9 | 6.9% |
| ს | 8 | 6.2% |
| ნ | 7 | 5.4% |
| მ | 7 | 5.4% |
| ო | 7 | 5.4% |
| ლ | 6 | 4.6% |
| ბ | 6 | 4.6% |
| ე | 6 | 4.6% |
| Other values (15) | 29 |
Hebrew
| Value | Count | Frequency (%) |
| י | 24 | |
| ו | 22 | |
| א | 11 | 7.3% |
| ל | 9 | 6.0% |
| ר | 9 | 6.0% |
| ב | 9 | 6.0% |
| ת | 9 | 6.0% |
| ש | 7 | 4.6% |
| מ | 7 | 4.6% |
| ח | 6 | 4.0% |
| Other values (14) | 38 |
Misc Symbols
| Value | Count | Frequency (%) |
| ☆ | 10 | |
| ♡ | 1 | 8.3% |
| ★ | 1 | 8.3% |
Diacriticals
| Value | Count | Frequency (%) |
| ́ | 5 |
Number Forms
| Value | Count | Frequency (%) |
| Ⅱ | 2 | |
| ⅓ | 1 | |
| Ⅲ | 1 | |
| Ⅰ | 1 |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ợ | 2 | |
| ủ | 2 | |
| ố | 1 | |
| ắ | 1 | |
| ộ | 1 | |
| ề | 1 | |
| ẳ | 1 | |
| ứ | 1 | |
| ồ | 1 | |
| ẫ | 1 | |
| Other values (2) | 2 |
CJK Compat Ideographs
| Value | Count | Frequency (%) |
| 琉 | 1 | |
| 龍 | 1 |
Letterlike Symbols
| Value | Count | Frequency (%) |
| ™ | 1 | |
| № | 1 |
CJK Ext A
| Value | Count | Frequency (%) |
| 㐂 | 1 |
Kannada
| Value | Count | Frequency (%) |
| ಾ | 1 | |
| ಯ | 1 | |
| ಿ | 1 | |
| ಸ | 1 | |
| ೂ | 1 | |
| ಲ | 1 |
Math Operators
| Value | Count | Frequency (%) |
| ∞ | 1 |
Arrows
| Value | Count | Frequency (%) |
| → | 1 |
Armenian
| Value | Count | Frequency (%) |
| Տ | 1 | |
| վ | 1 | |
| ի | 1 | |
| թ | 1 | |
| ե | 1 | |
| ր | 1 | |
| ա | 1 |
Lao
| Value | Count | Frequency (%) |
| ນ | 1 | |
| ທ | 1 | |
| ະ | 1 | |
| ລ | 1 | |
| ີ | 1 | |
| ັ | 1 | |
| ຈ | 1 |
overview
Text
MISSING 
| Distinct | 44307 |
|---|---|
| Distinct (%) | 99.5% |
| Missing | 954 |
| Missing (%) | 2.1% |
| Memory size | 17.8 MiB |
Length
| Max length | 1000 |
|---|---|
| Median length | 785 |
| Mean length | 323.3215537 |
| Min length | 1 |
Characters and Unicode
| Total characters | 14391689 |
|---|---|
| Distinct characters | 429 |
| Distinct categories | 25 |
| Distinct scripts | 13 |
| Distinct blocks | 21 |
Unique
| Unique | 44247 |
|---|---|
| Unique (%) | 99.4% |
Sample
| 1st row | Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences. |
|---|---|
| 2nd row | When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures. |
| 3rd row | A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max. |
| 4th row | Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe. |
| 5th row | Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own. |
| Value | Count | Frequency (%) |
| the | 138357 | 5.6% |
| a | 99037 | 4.0% |
| and | 75407 | 3.1% |
| to | 73442 | 3.0% |
| of | 69723 | 2.8% |
| in | 48228 | 2.0% |
| is | 36550 | 1.5% |
| his | 36210 | 1.5% |
| with | 23933 | 1.0% |
| her | 21518 | 0.9% |
| Other values (97181) | 1830623 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2410599 | |
| e | 1366183 | 9.5% |
| a | 942278 | 6.5% |
| t | 936476 | 6.5% |
| i | 853105 | 5.9% |
| o | 831419 | 5.8% |
| n | 824147 | 5.7% |
| s | 769188 | 5.3% |
| r | 745638 | 5.2% |
| h | 601821 | 4.2% |
| Other values (419) | 4110835 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11170227 | |
| Space Separator | 2410637 | 16.8% |
| Uppercase Letter | 391751 | 2.7% |
| Other Punctuation | 313382 | 2.2% |
| Decimal Number | 42329 | 0.3% |
| Dash Punctuation | 36848 | 0.3% |
| Close Punctuation | 10112 | 0.1% |
| Open Punctuation | 10090 | 0.1% |
| Final Punctuation | 4560 | < 0.1% |
| Initial Punctuation | 884 | < 0.1% |
| Other values (15) | 869 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1366183 | |
| a | 942278 | 8.4% |
| t | 936476 | 8.4% |
| i | 853105 | 7.6% |
| o | 831419 | 7.4% |
| n | 824147 | 7.4% |
| s | 769188 | 6.9% |
| r | 745638 | 6.7% |
| h | 601821 | 5.4% |
| l | 479703 | 4.3% |
| Other values (142) | 2820269 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 42831 | 10.9% |
| T | 36041 | 9.2% |
| S | 31203 | 8.0% |
| M | 24000 | 6.1% |
| B | 23750 | 6.1% |
| C | 22837 | 5.8% |
| H | 19463 | 5.0% |
| W | 18685 | 4.8% |
| I | 16837 | 4.3% |
| D | 16347 | 4.2% |
| Other values (77) | 139757 |
Other Letter
| Value | Count | Frequency (%) |
| र | 6 | 4.8% |
| न | 6 | 4.8% |
| म | 5 | 4.0% |
| の | 4 | 3.2% |
| प | 3 | 2.4% |
| द | 3 | 2.4% |
| ద | 3 | 2.4% |
| अ | 3 | 2.4% |
| న | 2 | 1.6% |
| ल | 2 | 1.6% |
| Other values (76) | 88 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 133694 | |
| . | 124991 | |
| ' | 31173 | 9.9% |
| " | 11693 | 3.7% |
| : | 3306 | 1.1% |
| ? | 2765 | 0.9% |
| ; | 2496 | 0.8% |
| ! | 1546 | 0.5% |
| / | 769 | 0.2% |
| & | 455 | 0.1% |
| Other values (12) | 494 | 0.2% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ి | 4 | |
| ́ | 4 | |
| ̈ | 3 | |
| ् | 3 | |
| ్ | 3 | |
| ் | 3 | |
| े | 2 | 6.1% |
| ं | 2 | 6.1% |
| ु | 2 | 6.1% |
| ా | 2 | 6.1% |
| Other values (4) | 5 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 9770 | |
| 0 | 8292 | |
| 9 | 6422 | |
| 2 | 4265 | |
| 5 | 2446 | 5.8% |
| 8 | 2384 | 5.6% |
| 3 | 2346 | 5.5% |
| 4 | 2181 | 5.2% |
| 7 | 2135 | 5.0% |
| 6 | 2088 | 4.9% |
Spacing Mark
| Value | Count | Frequency (%) |
| ा | 11 | |
| ी | 4 | 14.8% |
| ు | 3 | 11.1% |
| ो | 3 | 11.1% |
| ि | 2 | 7.4% |
| ு | 2 | 7.4% |
| ం | 1 | 3.7% |
| ி | 1 | 3.7% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 35321 | |
| – | 885 | 2.4% |
| — | 633 | 1.7% |
| ― | 5 | < 0.1% |
| ‐ | 4 | < 0.1% |
Other Symbol
| Value | Count | Frequency (%) |
| ® | 45 | |
| ™ | 14 | 21.9% |
| ° | 2 | 3.1% |
| ¦ | 2 | 3.1% |
| � | 1 | 1.6% |
Math Symbol
| Value | Count | Frequency (%) |
| ~ | 20 | |
| + | 12 | |
| = | 6 | 14.0% |
| | | 4 | 9.3% |
| − | 1 | 2.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 10036 | |
| [ | 51 | 0.5% |
| { | 2 | < 0.1% |
| „ | 1 | < 0.1% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 318 | |
| £ | 10 | 3.0% |
| ₹ | 1 | 0.3% |
| € | 1 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2410599 | |
| 36 | < 0.1% | |
| 2 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 10060 | |
| ] | 50 | 0.5% |
| } | 2 | < 0.1% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 3850 | |
| ” | 691 | 15.2% |
| » | 19 | 0.4% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 673 | |
| ‘ | 193 | 21.8% |
| « | 18 | 2.0% |
Control
| Value | Count | Frequency (%) |
| 106 | ||
| | 3 | 2.7% |
| | 1 | 0.9% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ´ | 25 | |
| ` | 12 | |
| ¯ | 1 | 2.6% |
Format
| Value | Count | Frequency (%) |
| | 31 | |
| | 20 |
Other Number
| Value | Count | Frequency (%) |
| ½ | 8 | |
| ¹ | 8 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 19 |
Line Separator
| Value | Count | Frequency (%) |
| (U+2028) | 7 | |
Paragraph Separator
| Value | Count | Frequency (%) |
| (U+2029) | 2 | |
Modifier Letter
| Value | Count | Frequency (%) |
| ʼ | 2 |
Letter Number
| Value | Count | Frequency (%) |
| Ⅱ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11556746 | |
| Common | 2829524 | 19.7% |
| Cyrillic | 4587 | < 0.1% |
| Greek | 648 | < 0.1% |
| Devanagari | 77 | < 0.1% |
| Telugu | 30 | < 0.1% |
| Hiragana | 20 | < 0.1% |
| Tamil | 19 | < 0.1% |
| Han | 10 | < 0.1% |
| Hangul | 9 | < 0.1% |
| Other values (3) | 19 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1366183 | |
| a | 942278 | 8.2% |
| t | 936476 | 8.1% |
| i | 853105 | 7.4% |
| o | 831419 | 7.2% |
| n | 824147 | 7.1% |
| s | 769188 | 6.7% |
| r | 745638 | 6.5% |
| h | 601821 | 5.2% |
| l | 479703 | 4.2% |
| Other values (132) | 3206788 |
Common
| Value | Count | Frequency (%) |
| (space) | 2410599 | |
| , | 133694 | 4.7% |
| . | 124991 | 4.4% |
| - | 35321 | 1.2% |
| ' | 31173 | 1.1% |
| " | 11693 | 0.4% |
| ) | 10060 | 0.4% |
| ( | 10036 | 0.4% |
| 1 | 9770 | 0.3% |
| 0 | 8292 | 0.3% |
| Other values (71) | 43895 | 1.6% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 470 | 10.2% |
| е | 404 | 8.8% |
| а | 373 | 8.1% |
| н | 323 | 7.0% |
| и | 299 | 6.5% |
| т | 265 | 5.8% |
| р | 240 | 5.2% |
| с | 218 | 4.8% |
| в | 173 | 3.8% |
| л | 161 | 3.5% |
| Other values (46) | 1661 |
Greek
| Value | Count | Frequency (%) |
| α | 60 | 9.3% |
| ο | 55 | 8.5% |
| τ | 43 | 6.6% |
| η | 36 | 5.6% |
| ι | 36 | 5.6% |
| ν | 34 | 5.2% |
| ε | 31 | 4.8% |
| ρ | 31 | 4.8% |
| π | 30 | 4.6% |
| ς | 30 | 4.6% |
| Other values (33) | 262 |
Devanagari
| Value | Count | Frequency (%) |
| ा | 11 | 14.3% |
| र | 6 | 7.8% |
| न | 6 | 7.8% |
| म | 5 | 6.5% |
| ी | 4 | 5.2% |
| प | 3 | 3.9% |
| द | 3 | 3.9% |
| ् | 3 | 3.9% |
| ो | 3 | 3.9% |
| अ | 3 | 3.9% |
| Other values (21) | 30 |
Hiragana
| Value | Count | Frequency (%) |
| の | 4 | |
| と | 1 | 5.0% |
| そ | 1 | 5.0% |
| ち | 1 | 5.0% |
| め | 1 | 5.0% |
| さ | 1 | 5.0% |
| ひ | 1 | 5.0% |
| ず | 1 | 5.0% |
| か | 1 | 5.0% |
| み | 1 | 5.0% |
| Other values (7) | 7 |
Telugu
| Value | Count | Frequency (%) |
| ి | 4 | |
| ు | 3 | |
| ్ | 3 | |
| ద | 3 | |
| న | 2 | 6.7% |
| మ | 2 | 6.7% |
| ర | 2 | 6.7% |
| ా | 2 | 6.7% |
| స | 2 | 6.7% |
| జ | 1 | 3.3% |
| Other values (6) | 6 |
Tamil
| Value | Count | Frequency (%) |
| ் | 3 | |
| ப | 2 | |
| ு | 2 | |
| ர | 2 | |
| ம | 2 | |
| ஆ | 1 | 5.3% |
| த | 1 | 5.3% |
| வ | 1 | 5.3% |
| ன | 1 | 5.3% |
| ச | 1 | 5.3% |
| Other values (3) | 3 |
Han
| Value | Count | Frequency (%) |
| 患 | 1 | |
| 者 | 1 | |
| 水 | 1 | |
| 世 | 1 | |
| 界 | 1 | |
| 俣 | 1 | |
| 見 | 1 | |
| 鬼 | 1 | |
| 難 | 1 | |
| 海 | 1 |
Hangul
| Value | Count | Frequency (%) |
| 사 | 2 | |
| 회 | 1 | |
| 식 | 1 | |
| 주 | 1 | |
| 기 | 1 | |
| 찾 | 1 | |
| 랑 | 1 | |
| 첫 | 1 |
Thai
| Value | Count | Frequency (%) |
| ่ | 2 | |
| ส | 1 | |
| ี | 1 | |
| แ | 1 | |
| พ | 1 | |
| ร | 1 | |
| ง | 1 |
Arabic
| Value | Count | Frequency (%) |
| م | 2 | |
| ہ | 1 | |
| ت | 1 |
Inherited
| Value | Count | Frequency (%) |
| ́ | 4 | |
| ̈ | 3 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14373677 | |
| Punctuation | 7281 | 0.1% |
| None | 5933 | < 0.1% |
| Cyrillic | 4587 | < 0.1% |
| Devanagari | 77 | < 0.1% |
| Telugu | 30 | < 0.1% |
| Hiragana | 20 | < 0.1% |
| Tamil | 19 | < 0.1% |
| Letterlike Symbols | 14 | < 0.1% |
| CJK | 10 | < 0.1% |
| Other values (11) | 41 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 2410599 | |
| e | 1366183 | 9.5% |
| a | 942278 | 6.6% |
| t | 936476 | 6.5% |
| i | 853105 | 5.9% |
| o | 831419 | 5.8% |
| n | 824147 | 5.7% |
| s | 769188 | 5.4% |
| r | 745638 | 5.2% |
| h | 601821 | 4.2% |
| Other values (82) | 4092823 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 3850 | |
| – | 885 | 12.2% |
| ” | 691 | 9.5% |
| “ | 673 | 9.2% |
| — | 633 | 8.7% |
| … | 304 | 4.2% |
| ‘ | 193 | 2.7% |
| | 31 | 0.4% |
| (line separator, U+2028) | 7 | 0.1% |
| ― | 5 | 0.1% |
| Other values (4) | 9 | 0.1% |
None
| Value | Count | Frequency (%) |
| é | 1552 | |
| ä | 294 | 5.0% |
| á | 293 | 4.9% |
| ö | 250 | 4.2% |
| í | 244 | 4.1% |
| è | 209 | 3.5% |
| ü | 178 | 3.0% |
| ı | 165 | 2.8% |
| ó | 164 | 2.8% |
| ç | 158 | 2.7% |
| Other values (141) | 2426 |
Cyrillic
| Value | Count | Frequency (%) |
| о | 470 | 10.2% |
| е | 404 | 8.8% |
| а | 373 | 8.1% |
| н | 323 | 7.0% |
| и | 299 | 6.5% |
| т | 265 | 5.8% |
| р | 240 | 5.2% |
| с | 218 | 4.8% |
| в | 173 | 3.8% |
| л | 161 | 3.5% |
| Other values (46) | 1661 |
Letterlike Symbols
| Value | Count | Frequency (%) |
| ™ | 14 |
Devanagari
| Value | Count | Frequency (%) |
| ा | 11 | 14.3% |
| र | 6 | 7.8% |
| न | 6 | 7.8% |
| म | 5 | 6.5% |
| ी | 4 | 5.2% |
| प | 3 | 3.9% |
| द | 3 | 3.9% |
| ् | 3 | 3.9% |
| ो | 3 | 3.9% |
| अ | 3 | 3.9% |
| Other values (21) | 30 |
Hiragana
| Value | Count | Frequency (%) |
| の | 4 | |
| と | 1 | 5.0% |
| そ | 1 | 5.0% |
| ち | 1 | 5.0% |
| め | 1 | 5.0% |
| さ | 1 | 5.0% |
| ひ | 1 | 5.0% |
| ず | 1 | 5.0% |
| か | 1 | 5.0% |
| み | 1 | 5.0% |
| Other values (7) | 7 |
Telugu
| Value | Count | Frequency (%) |
| ి | 4 | |
| ు | 3 | |
| ్ | 3 | |
| ద | 3 | |
| న | 2 | 6.7% |
| మ | 2 | 6.7% |
| ర | 2 | 6.7% |
| ా | 2 | 6.7% |
| స | 2 | 6.7% |
| జ | 1 | 3.3% |
| Other values (6) | 6 |
Diacriticals
| Value | Count | Frequency (%) |
| ́ | 4 | |
| ̈ | 3 |
Alphabetic PF
| Value | Count | Frequency (%) |
| fi | 4 |
Tamil
| Value | Count | Frequency (%) |
| ் | 3 | |
| ப | 2 | |
| ு | 2 | |
| ர | 2 | |
| ம | 2 | |
| ஆ | 1 | 5.3% |
| த | 1 | 5.3% |
| வ | 1 | 5.3% |
| ன | 1 | 5.3% |
| ச | 1 | 5.3% |
| Other values (3) | 3 |
Hangul
| Value | Count | Frequency (%) |
| 사 | 2 | |
| 회 | 1 | |
| 식 | 1 | |
| 주 | 1 | |
| 기 | 1 | |
| 찾 | 1 | |
| 랑 | 1 | |
| 첫 | 1 |
Arabic
| Value | Count | Frequency (%) |
| م | 2 | |
| ہ | 1 | |
| ت | 1 |
Thai
| Value | Count | Frequency (%) |
| ่ | 2 | |
| ส | 1 | |
| ี | 1 | |
| แ | 1 | |
| พ | 1 | |
| ร | 1 | |
| ง | 1 |
Modifier Letters
| Value | Count | Frequency (%) |
| ʼ | 2 |
Number Forms
| Value | Count | Frequency (%) |
| Ⅱ | 2 |
CJK
| Value | Count | Frequency (%) |
| 患 | 1 | |
| 者 | 1 | |
| 水 | 1 | |
| 世 | 1 | |
| 界 | 1 | |
| 俣 | 1 | |
| 見 | 1 | |
| 鬼 | 1 | |
| 難 | 1 | |
| 海 | 1 |
Math Operators
| Value | Count | Frequency (%) |
| − | 1 |
Katakana
| Value | Count | Frequency (%) |
| ・ | 1 |
Currency Symbols
| Value | Count | Frequency (%) |
| ₹ | 1 | |
| € | 1 |
Specials
| Value | Count | Frequency (%) |
| � | 1 |
popularity
Unsupported
REJECTED  UNSUPPORTED 
| Missing | 5 |
|---|---|
| Missing (%) | < 0.1% |
| Memory size | 1.8 MiB |
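The report rejects `popularity` as an unsupported type, which usually points to a column holding a mix of numbers and non-numeric strings. A minimal sketch of one way to clean such a column before profiling again is shown below. The helper name `to_float` and the sample values are hypothetical and are not taken from the dataset.

```python
import math


def to_float(value):
    """Return `value` as a float, or NaN when it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # None, free text and other unparseable entries become NaN
        return math.nan


# Hypothetical mixed values standing in for the real column
raw = [21.946943, "17.015539", "not a number", None]
clean = [to_float(v) for v in raw]
print(clean)
```

With pandas, `pd.to_numeric(df["popularity"], errors="coerce")` applies the same idea to a whole column, assuming the frame is loaded as `df`. The coerced column should then be profiled as Numeric rather than Unsupported.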
poster_path
Text
| Distinct | 45024 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 386 |
| Missing (%) | 0.8% |
| Memory size | 3.8 MiB |
Length
| Max length | 35 |
|---|---|
| Median length | 32 |
| Mean length | 31.97162822 |
| Min length | 12 |
Characters and Unicode
| Total characters | 1441281 |
|---|---|
| Distinct characters | 66 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 44977 |
|---|---|
| Unique (%) | 99.8% |
Sample
| 1st row | /rhIRbceoE9lR4veEXuwCC2wARtG.jpg |
|---|---|
| 2nd row | /vzmL6fP7aPKNKPRTFnZmiUfciyV.jpg |
| 3rd row | /6ksm1sjKMFLbO7UY2i6G1ju9SML.jpg |
| 4th row | /16XOMpEaLWkrcPqSQqhTmeJuqQl.jpg |
| 5th row | /e64sOI48hQXyru7naBFyssKFxVd.jpg |
| Value | Count | Frequency (%) |
| 5d7ubsegdyone6lql6xs7s6olcw.jpg | 5 | < 0.1% |
| qw1oqlohizrhxzqrpkimyr0oxzn.jpg | 4 | < 0.1% |
| 2kslzxoaw0hmnguvpcnqlcdxfr9.jpg | 4 | < 0.1% |
| cdwvc18urfedqjjxqjyrmogdc0h.jpg | 3 | < 0.1% |
| 8vsz9coczxocw2we2qene1h1fko.jpg | 3 | < 0.1% |
| bql0pvhbq8jmw3njcl38kw0coem.jpg | 2 | < 0.1% |
| w56oo9nrecf54snxvyue9qxzfjt.jpg | 2 | < 0.1% |
| xue1ilucohbxmy0fiqktt6d013n.jpg | 2 | < 0.1% |
| g21ruzz3bzeudukmb82kejjtufk.jpg | 2 | < 0.1% |
| iqd7zwhsece3cgdpclidxjgfdzl.jpg | 2 | < 0.1% |
| Other values (45020) | 45057 |
Most occurring characters
| Value | Count | Frequency (%) |
| g | 65293 | 4.5% |
| p | 65148 | 4.5% |
| j | 65043 | 4.5% |
| / | 45077 | 3.1% |
| . | 45077 | 3.1% |
| v | 20444 | 1.4% |
| d | 20329 | 1.4% |
| m | 20322 | 1.4% |
| q | 20256 | 1.4% |
| t | 20248 | 1.4% |
| Other values (56) | 1054044 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 659135 | |
| Uppercase Letter | 492145 | |
| Decimal Number | 199840 | 13.9% |
| Other Punctuation | 90155 | 6.3% |
| Space Separator | 6 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| g | 65293 | 9.9% |
| p | 65148 | 9.9% |
| j | 65043 | 9.9% |
| v | 20444 | 3.1% |
| d | 20329 | 3.1% |
| m | 20322 | 3.1% |
| q | 20256 | 3.1% |
| t | 20248 | 3.1% |
| n | 20233 | 3.1% |
| l | 20232 | 3.1% |
| Other values (16) | 321587 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 19398 | 3.9% |
| R | 19191 | 3.9% |
| M | 19170 | 3.9% |
| C | 19151 | 3.9% |
| W | 19140 | 3.9% |
| V | 19138 | 3.9% |
| T | 18983 | 3.9% |
| K | 18965 | 3.9% |
| L | 18955 | 3.9% |
| D | 18953 | 3.9% |
| Other values (16) | 301101 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 20218 | |
| 8 | 20217 | |
| 3 | 20157 | |
| 9 | 20106 | |
| 5 | 20101 | |
| 2 | 20051 | |
| 6 | 20009 | |
| 4 | 20007 | |
| 7 | 19898 | |
| 0 | 19076 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 45077 | |
| . | 45077 | |
| : | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 6 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1151280 | |
| Common | 290001 | 20.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| g | 65293 | 5.7% |
| p | 65148 | 5.7% |
| j | 65043 | 5.6% |
| v | 20444 | 1.8% |
| d | 20329 | 1.8% |
| m | 20322 | 1.8% |
| q | 20256 | 1.8% |
| t | 20248 | 1.8% |
| n | 20233 | 1.8% |
| l | 20232 | 1.8% |
| Other values (42) | 813732 |
Common
| Value | Count | Frequency (%) |
| / | 45077 | |
| . | 45077 | |
| 1 | 20218 | |
| 8 | 20217 | |
| 3 | 20157 | |
| 9 | 20106 | |
| 5 | 20101 | |
| 2 | 20051 | |
| 6 | 20009 | |
| 4 | 20007 | |
| Other values (4) | 38981 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1441281 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| g | 65293 | 4.5% |
| p | 65148 | 4.5% |
| j | 65043 | 4.5% |
| / | 45077 | 3.1% |
| . | 45077 | 3.1% |
| v | 20444 | 1.4% |
| d | 20329 | 1.4% |
| m | 20322 | 1.4% |
| q | 20256 | 1.4% |
| t | 20248 | 1.4% |
| Other values (56) | 1054044 |
production_companies
Text
| Distinct | 22708 |
|---|---|
| Distinct (%) | 49.9% |
| Missing | 3 |
| Missing (%) | < 0.1% |
| Memory size | 5.6 MiB |
Length
| Max length | 1252 |
|---|---|
| Median length | 954 |
| Mean length | 70.09882762 |
| Min length | 2 |
Characters and Unicode
| Total characters | 3186903 |
|---|---|
| Distinct characters | 293 |
| Distinct categories | 15 |
| Distinct scripts | 6 |
| Distinct blocks | 6 |
Unique
| Unique | 20344 |
|---|---|
| Unique (%) | 44.7% |
Sample
| 1st row | [{'name': 'Pixar Animation Studios', 'id': 3}] |
|---|---|
| 2nd row | [{'name': 'TriStar Pictures', 'id': 559}, {'name': 'Teitler Film', 'id': 2550}, {'name': 'Interscope Communications', 'id': 10201}] |
| 3rd row | [{'name': 'Warner Bros.', 'id': 6194}, {'name': 'Lancaster Gate', 'id': 19464}] |
| 4th row | [{'name': 'Twentieth Century Fox Film Corporation', 'id': 306}] |
| 5th row | [{'name': 'Sandollar Productions', 'id': 5842}, {'name': 'Touchstone Pictures', 'id': 9195}] |
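The sample rows above are Python-literal strings (single-quoted keys and values), not strict JSON, so `json.loads` would reject them. A minimal sketch, assuming the column has been loaded as plain strings, of extracting the company names with the standard library's `ast.literal_eval`. The helper name `parse_names` is hypothetical and is not part of the report or of ydata-profiling.

```python
import ast

def parse_names(cell):
    """Return the 'name' fields from a stringified list of dicts; [] if unparseable."""
    if not isinstance(cell, str):
        return []  # missing values such as None or NaN
    try:
        items = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return []  # not a valid Python literal
    if not isinstance(items, list):
        return []  # a literal, but not a list (for example 'False')
    return [d["name"] for d in items if isinstance(d, dict) and "name" in d]

sample = "[{'name': 'Warner Bros.', 'id': 6194}, {'name': 'Lancaster Gate', 'id': 19464}]"
print(parse_names(sample))  # ['Warner Bros.', 'Lancaster Gate']
```

Returning an empty list for missing or malformed cells keeps downstream code uniform; whether that is the right policy for a given analysis is a judgment call.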
| Value | Count | Frequency (%) |
| id | 70546 | 17.6% |
| name | 70546 | 17.6% |
| (blank) | 12719 | 3.2% |
| films | 9457 | 2.4% |
| pictures | 9267 | 2.3% |
| productions | 9061 | 2.3% |
| film | 6679 | 1.7% |
| entertainment | 5156 | 1.3% |
| corporation | 2190 | 0.5% |
| company | 1769 | 0.4% |
| Other values (42195) | 203834 |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 422867 | 13.3% |
| (space) | 355774 | 11.2% |
| i | 177505 | 5.6% |
| e | 165212 | 5.2% |
| n | 160535 | 5.0% |
| a | 147709 | 4.6% |
| : | 141099 | 4.4% |
| m | 114830 | 3.6% |
| , | 107909 | 3.4% |
| d | 104017 | 3.3% |
| Other values (283) | 1289446 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1410509 | |
| Other Punctuation | 680026 | |
| Space Separator | 355774 | 11.2% |
| Decimal Number | 295745 | 9.3% |
| Uppercase Letter | 199007 | 6.2% |
| Open Punctuation | 120335 | 3.8% |
| Close Punctuation | 120334 | 3.8% |
| Dash Punctuation | 4331 | 0.1% |
| Math Symbol | 662 | < 0.1% |
| Other Letter | 140 | < 0.1% |
| Other values (5) | 40 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 177505 | |
| e | 165212 | |
| n | 160535 | |
| a | 147709 | |
| m | 114830 | |
| d | 104017 | |
| o | 85310 | 6.0% |
| r | 83560 | 5.9% |
| t | 83459 | 5.9% |
| s | 62684 | 4.4% |
| Other values (102) | 225688 |
Other Letter
| Value | Count | Frequency (%) |
| 스 | 9 | 6.4% |
| 트 | 8 | 5.7% |
| 인 | 6 | 4.3% |
| 테 | 5 | 3.6% |
| 먼 | 5 | 3.6% |
| 터 | 5 | 3.6% |
| 엔 | 5 | 3.6% |
| 주 | 5 | 3.6% |
| 픽 | 4 | 2.9% |
| 디 | 3 | 2.1% |
| Other values (62) | 85 |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 27882 | |
| F | 26367 | |
| C | 20589 | 10.3% |
| M | 13363 | 6.7% |
| S | 11914 | 6.0% |
| E | 9750 | 4.9% |
| A | 9550 | 4.8% |
| T | 9357 | 4.7% |
| B | 9006 | 4.5% |
| G | 7812 | 3.9% |
| Other values (52) | 53417 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 422867 | |
| : | 141099 | 20.7% |
| , | 107909 | 15.9% |
| . | 5671 | 0.8% |
| " | 987 | 0.1% |
| & | 765 | 0.1% |
| / | 645 | 0.1% |
| ! | 36 | < 0.1% |
| % | 18 | < 0.1% |
| \ | 12 | < 0.1% |
| Other values (6) | 17 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 45079 | |
| 2 | 33554 | |
| 3 | 31849 | |
| 4 | 30685 | |
| 6 | 28094 | |
| 5 | 27816 | |
| 8 | 25853 | |
| 7 | 24553 | |
| 9 | 24362 | |
| 0 | 23900 |
Close Punctuation
| Value | Count | Frequency (%) |
| } | 70545 | |
| ] | 45469 | |
| ) | 4319 | 3.6% |
| ) | 1 | < 0.1% |
Open Punctuation
| Value | Count | Frequency (%) |
| { | 70545 | |
| [ | 45469 | |
| ( | 4320 | 3.6% |
| ( | 1 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4329 | |
| – | 2 | < 0.1% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 661 | |
| | | 1 | 0.2% |
Other Symbol
| Value | Count | Frequency (%) |
| ° | 23 | |
| ㈜ | 2 | 8.0% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 3 | |
| » | 3 |
Other Number
| Value | Count | Frequency (%) |
| ½ | 1 | |
| ² | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 355774 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 4 |
Initial Punctuation
| Value | Count | Frequency (%) |
| « | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1609113 | |
| Common | 1577245 | |
| Cyrillic | 373 | < 0.1% |
| Hangul | 115 | < 0.1% |
| Greek | 31 | < 0.1% |
| Han | 26 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 177505 | |
| e | 165212 | 10.3% |
| n | 160535 | 10.0% |
| a | 147709 | 9.2% |
| m | 114830 | 7.1% |
| d | 104017 | 6.5% |
| o | 85310 | 5.3% |
| r | 83560 | 5.2% |
| t | 83459 | 5.2% |
| s | 62684 | 3.9% |
| Other values (99) | 424292 |
Hangul
| Value | Count | Frequency (%) |
| 스 | 9 | 7.8% |
| 트 | 8 | 7.0% |
| 인 | 6 | 5.2% |
| 테 | 5 | 4.3% |
| 먼 | 5 | 4.3% |
| 터 | 5 | 4.3% |
| 엔 | 5 | 4.3% |
| 주 | 5 | 4.3% |
| 픽 | 4 | 3.5% |
| 디 | 3 | 2.6% |
| Other values (43) | 60 |
Common
| Value | Count | Frequency (%) |
| ' | 422867 | |
| (space) | 355774 | |
| : | 141099 | 8.9% |
| , | 107909 | 6.8% |
| } | 70545 | 4.5% |
| { | 70545 | 4.5% |
| [ | 45469 | 2.9% |
| ] | 45469 | 2.9% |
| 1 | 45079 | 2.9% |
| 2 | 33554 | 2.1% |
| Other values (36) | 238935 |
Cyrillic
| Value | Count | Frequency (%) |
| и | 34 | 9.1% |
| о | 28 | 7.5% |
| а | 26 | 7.0% |
| л | 22 | 5.9% |
| н | 20 | 5.4% |
| м | 19 | 5.1% |
| т | 17 | 4.6% |
| ь | 16 | 4.3% |
| с | 16 | 4.3% |
| е | 16 | 4.3% |
| Other values (36) | 159 |
Greek
| Value | Count | Frequency (%) |
| ο | 3 | 9.7% |
| ν | 3 | 9.7% |
| η | 2 | 6.5% |
| λ | 2 | 6.5% |
| Ε | 2 | 6.5% |
| ι | 2 | 6.5% |
| ρ | 2 | 6.5% |
| τ | 2 | 6.5% |
| Κ | 2 | 6.5% |
| ό | 1 | 3.2% |
| Other values (10) | 10 |
Han
| Value | Count | Frequency (%) |
| 北 | 2 | 7.7% |
| 京 | 2 | 7.7% |
| 影 | 2 | 7.7% |
| 司 | 2 | 7.7% |
| 公 | 2 | 7.7% |
| 限 | 2 | 7.7% |
| 有 | 2 | 7.7% |
| 乐 | 1 | 3.8% |
| 电 | 1 | 3.8% |
| 发 | 1 | 3.8% |
| Other values (9) | 9 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3180679 | |
| None | 5706 | 0.2% |
| Cyrillic | 373 | < 0.1% |
| Hangul | 113 | < 0.1% |
| CJK | 26 | < 0.1% |
| Punctuation | 6 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| ' | 422867 | 13.3% |
| (space) | 355774 | 11.2% |
| i | 177505 | 5.6% |
| e | 165212 | 5.2% |
| n | 160535 | 5.0% |
| a | 147709 | 4.6% |
| : | 141099 | 4.4% |
| m | 114830 | 3.6% |
| , | 107909 | 3.4% |
| d | 104017 | 3.3% |
| Other values (78) | 1283222 |
None
| Value | Count | Frequency (%) |
| é | 3176 | |
| ó | 416 | 7.3% |
| á | 317 | 5.6% |
| í | 173 | 3.0% |
| ü | 154 | 2.7% |
| ñ | 150 | 2.6% |
| ô | 140 | 2.5% |
| ä | 137 | 2.4% |
| è | 136 | 2.4% |
| ö | 132 | 2.3% |
| Other values (75) | 775 | 13.6% |
Cyrillic
| Value | Count | Frequency (%) |
| и | 34 | 9.1% |
| о | 28 | 7.5% |
| а | 26 | 7.0% |
| л | 22 | 5.9% |
| н | 20 | 5.4% |
| м | 19 | 5.1% |
| т | 17 | 4.6% |
| ь | 16 | 4.3% |
| с | 16 | 4.3% |
| е | 16 | 4.3% |
| Other values (36) | 159 |
Hangul
| Value | Count | Frequency (%) |
| 스 | 9 | 8.0% |
| 트 | 8 | 7.1% |
| 인 | 6 | 5.3% |
| 테 | 5 | 4.4% |
| 먼 | 5 | 4.4% |
| 터 | 5 | 4.4% |
| 엔 | 5 | 4.4% |
| 주 | 5 | 4.4% |
| 픽 | 4 | 3.5% |
| 디 | 3 | 2.7% |
| Other values (42) | 58 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 3 | |
| – | 2 | |
| • | 1 | 16.7% |
CJK
| Value | Count | Frequency (%) |
| 北 | 2 | 7.7% |
| 京 | 2 | 7.7% |
| 影 | 2 | 7.7% |
| 司 | 2 | 7.7% |
| 公 | 2 | 7.7% |
| 限 | 2 | 7.7% |
| 有 | 2 | 7.7% |
| 乐 | 1 | 3.8% |
| 电 | 1 | 3.8% |
| 发 | 1 | 3.8% |
| Other values (9) | 9 |
production_countries
Text
| Distinct | 2393 |
|---|---|
| Distinct (%) | 5.3% |
| Missing | 3 |
| Missing (%) | < 0.1% |
| Memory size | 4.8 MiB |
Length
| Max length | 1039 |
|---|---|
| Median length | 649 |
| Mean length | 53.20049271 |
| Min length | 2 |
Characters and Unicode
| Total characters | 2418654 |
|---|---|
| Distinct characters | 69 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1768 |
|---|---|
| Unique (%) | 3.9% |
Sample
| 1st row | [{'iso_3166_1': 'US', 'name': 'United States of America'}] |
|---|---|
| 2nd row | [{'iso_3166_1': 'US', 'name': 'United States of America'}] |
| 3rd row | [{'iso_3166_1': 'US', 'name': 'United States of America'}] |
| 4th row | [{'iso_3166_1': 'US', 'name': 'United States of America'}] |
| 5th row | [{'iso_3166_1': 'US', 'name': 'United States of America'}] |
| Value | Count | Frequency (%) |
| iso_3166_1 | 49423 | |
| name | 49423 | |
| united | 25275 | |
| states | 21154 | |
| of | 21153 | |
| america | 21153 | |
| us | 21153 | |
| (blank) | 6282 | 2.3% |
| gb | 4094 | 1.5% |
| kingdom | 4094 | 1.5% |
| Other values (344) | 50140 |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 395379 | |
| (space) | 227881 | 9.4% |
| e | 130095 | 5.4% |
| a | 119929 | 5.0% |
| i | 107991 | 4.5% |
| 6 | 98847 | 4.1% |
| _ | 98846 | 4.1% |
| 1 | 98846 | 4.1% |
| : | 98846 | 4.1% |
| n | 96933 | 4.0% |
| Other values (59) | 945061 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 904689 | |
| Other Punctuation | 553906 | |
| Decimal Number | 247121 | 10.2% |
| Space Separator | 227881 | 9.4% |
| Uppercase Letter | 196445 | 8.1% |
| Connector Punctuation | 98846 | 4.1% |
| Open Punctuation | 94883 | 3.9% |
| Close Punctuation | 94883 | 3.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 130095 | |
| a | 119929 | |
| i | 107991 | |
| n | 96933 | |
| o | 79012 | |
| m | 78136 | |
| s | 74119 | |
| t | 72641 | |
| d | 34558 | 3.8% |
| r | 32498 | 3.6% |
| Other values (16) | 78777 |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 48407 | |
| S | 46889 | |
| A | 25531 | |
| F | 8678 | 4.4% |
| R | 7997 | 4.1% |
| I | 7601 | 3.9% |
| G | 6924 | 3.5% |
| K | 6810 | 3.5% |
| B | 5860 | 3.0% |
| C | 5367 | 2.7% |
| Other values (16) | 26381 |
Decimal Number
| Value | Count | Frequency (%) |
| 6 | 98847 | |
| 1 | 98846 | |
| 3 | 49424 | |
| 0 | 2 | < 0.1% |
| 7 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 395379 | |
| : | 98846 | 17.8% |
| , | 59668 | 10.8% |
| " | 10 | < 0.1% |
| . | 3 | < 0.1% |
Open Punctuation
| Value | Count | Frequency (%) |
| { | 49423 | |
| [ | 45460 |
Close Punctuation
| Value | Count | Frequency (%) |
| } | 49423 | |
| ] | 45460 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 227881 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 98846 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1317520 | |
| Latin | 1101134 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 130095 | |
| a | 119929 | |
| i | 107991 | |
| n | 96933 | 8.8% |
| o | 79012 | 7.2% |
| m | 78136 | 7.1% |
| s | 74119 | 6.7% |
| t | 72641 | 6.6% |
| U | 48407 | 4.4% |
| S | 46889 | 4.3% |
| Other values (42) | 246982 |
Common
| Value | Count | Frequency (%) |
| ' | 395379 | |
| (space) | 227881 | |
| 6 | 98847 | 7.5% |
| _ | 98846 | 7.5% |
| 1 | 98846 | 7.5% |
| : | 98846 | 7.5% |
| , | 59668 | 4.5% |
| 3 | 49424 | 3.8% |
| { | 49423 | 3.8% |
| } | 49423 | 3.8% |
| Other values (7) | 90937 | 6.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2418654 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| ' | 395379 | |
| (space) | 227881 | 9.4% |
| e | 130095 | 5.4% |
| a | 119929 | 5.0% |
| i | 107991 | 4.5% |
| 6 | 98847 | 4.1% |
| _ | 98846 | 4.1% |
| 1 | 98846 | 4.1% |
| : | 98846 | 4.1% |
| n | 96933 | 4.0% |
| Other values (59) | 945061 |
release_date
Text
| Distinct | 17336 |
|---|---|
| Distinct (%) | 38.2% |
| Missing | 87 |
| Missing (%) | 0.2% |
| Memory size | 2.9 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.999449084 |
| Min length | 1 |
Characters and Unicode
| Total characters | 453765 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 8573 |
|---|---|
| Unique (%) | 18.9% |
Sample
| 1st row | 1995-10-30 |
|---|---|
| 2nd row | 1995-12-15 |
| 3rd row | 1995-12-22 |
| 4th row | 1995-12-22 |
| 5th row | 1995-02-10 |
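The length table reports a minimum length of 1 against a maximum of 10, which suggests that a few values are not year-month-day dates. A minimal sketch, assuming the column is read as strings, that parses well-formed dates and maps everything else to `None`. The helper name `parse_release_date` is hypothetical.

```python
from datetime import datetime

def parse_release_date(value):
    """Parse a year-month-day string into a date; return None for anything else."""
    if not isinstance(value, str):
        return None  # missing values such as None or NaN
    try:
        return datetime.strptime(value.strip(), "%Y-%m-%d").date()
    except ValueError:
        return None  # malformed entries, for example a single character

print(parse_release_date("1995-10-30"))  # 1995-10-30
print(parse_release_date("1"))           # None
```

Mapping bad values to `None` instead of raising keeps the malformed rows visible as missing, so they can be counted before deciding whether to drop them.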
| Value | Count | Frequency (%) |
| 2008-01-01 | 136 | 0.3% |
| 2009-01-01 | 121 | 0.3% |
| 2007-01-01 | 118 | 0.3% |
| 2005-01-01 | 111 | 0.2% |
| 2006-01-01 | 101 | 0.2% |
| 2002-01-01 | 96 | 0.2% |
| 2004-01-01 | 90 | 0.2% |
| 2001-01-01 | 84 | 0.2% |
| 2003-01-01 | 76 | 0.2% |
| 1997-01-01 | 69 | 0.2% |
| Other values (17326) | 44377 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 97600 | |
| - | 90752 | |
| 1 | 84056 | |
| 2 | 52806 | |
| 9 | 39773 | |
| 3 | 15435 | 3.4% |
| 8 | 15279 | 3.4% |
| 6 | 15021 | 3.3% |
| 5 | 14836 | 3.3% |
| 7 | 14289 | 3.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 363013 | |
| Dash Punctuation | 90752 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 97600 | |
| 1 | 84056 | |
| 2 | 52806 | |
| 9 | 39773 | |
| 3 | 15435 | 4.3% |
| 8 | 15279 | 4.2% |
| 6 | 15021 | 4.1% |
| 5 | 14836 | 4.1% |
| 7 | 14289 | 3.9% |
| 4 | 13918 | 3.8% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 90752 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 453765 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 97600 | |
| - | 90752 | |
| 1 | 84056 | |
| 2 | 52806 | |
| 9 | 39773 | |
| 3 | 15435 | 3.4% |
| 8 | 15279 | 3.4% |
| 6 | 15021 | 3.3% |
| 5 | 14836 | 3.3% |
| 7 | 14289 | 3.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 453765 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 97600 | |
| - | 90752 | |
| 1 | 84056 | |
| 2 | 52806 | |
| 9 | 39773 | |
| 3 | 15435 | 3.4% |
| 8 | 15279 | 3.4% |
| 6 | 15021 | 3.3% |
| 5 | 14836 | 3.3% |
| 7 | 14289 | 3.1% |
revenue
Real number (ℝ)
ZEROS 
| Distinct | 6863 |
|---|---|
| Distinct (%) | 15.1% |
| Missing | 6 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11209348.54 |
| Minimum | 0 |
|---|---|
| Maximum | 2787965087 |
| Zeros | 38052 |
| Zeros (%) | 83.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 355.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 47808918.5 |
| Maximum | 2787965087 |
| Range | 2787965087 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 64332246.74 |
|---|---|
| Coefficient of variation (CV) | 5.739160176 |
| Kurtosis | 237.5105858 |
| Mean | 11209348.54 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 12.26598291 |
| Sum | 5.095769846 × 10¹¹ |
| Variance | 4.138637971 × 10¹⁵ |
| Monotonicity | Not monotonic |
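The descriptive statistics above are related to each other: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A short check using only the figures reported for revenue; the tolerance is an assumption chosen to absorb rounding in the printed values.

```python
import math

mean = 11209348.54   # reported Mean
std = 64332246.74    # reported Standard deviation

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance as the square of the standard deviation

# Compare against the reported CV and Variance, allowing for print rounding.
assert math.isclose(cv, 5.739160176, rel_tol=1e-6)
assert math.isclose(variance, 4.138637971e15, rel_tol=1e-6)
print(cv, variance)
```

A coefficient of variation well above 1 is consistent with the 83.7% zeros and the strong positive skewness reported for this variable.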
| Value | Count | Frequency (%) |
| 0 | 38052 | |
| 12000000 | 20 | < 0.1% |
| 10000000 | 19 | < 0.1% |
| 11000000 | 19 | < 0.1% |
| 2000000 | 18 | < 0.1% |
| 6000000 | 17 | < 0.1% |
| 5000000 | 14 | < 0.1% |
| 500000 | 13 | < 0.1% |
| 8000000 | 13 | < 0.1% |
| 1 | 12 | < 0.1% |
| Other values (6853) | 7263 | 16.0% |
| Value | Count | Frequency (%) |
| 0 | 38052 | |
| 1 | 12 | < 0.1% |
| 2 | 3 | < 0.1% |
| 3 | 9 | < 0.1% |
| 4 | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| 2787965087 | 1 | |
| 2068223624 | 1 | |
| 1845034188 | 1 | |
| 1519557910 | 1 | |
| 1513528810 | 1 |
runtime
Real number (ℝ)
ZEROS 
| Distinct | 353 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 263 |
| Missing (%) | 0.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 94.12819946 |
| Minimum | 0 |
|---|---|
| Maximum | 1256 |
| Zeros | 1558 |
| Zeros (%) | 3.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 355.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 11 |
| Q1 | 85 |
| median | 95 |
| Q3 | 107 |
| 95-th percentile | 138 |
| Maximum | 1256 |
| Range | 1256 |
| Interquartile range (IQR) | 22 |
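The interquartile range above is Q3 - Q1 = 107 - 85 = 22. As an illustration only (the report itself does not flag outliers this way), the sketch below applies Tukey's 1.5 × IQR rule to the reported quartiles. The helper name `tukey_fences` is hypothetical.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles taken from the runtime quantile statistics.
lower, upper = tukey_fences(85, 107)
print(lower, upper)  # 52.0 140.0
```

Under this rule both the zero runtimes and the maximum of 1256 fall outside the fences, so they would deserve a closer look before modelling.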
Descriptive statistics
| Standard deviation | 38.40781049 |
|---|---|
| Coefficient of variation (CV) | 0.4080372376 |
| Kurtosis | 93.21715769 |
| Mean | 94.12819946 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 4.465957935 |
| Sum | 4254877 |
| Variance | 1475.159906 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 2556 | 5.6% |
| 0 | 1558 | 3.4% |
| 100 | 1470 | 3.2% |
| 95 | 1412 | 3.1% |
| 93 | 1214 | 2.7% |
| 96 | 1104 | 2.4% |
| 92 | 1080 | 2.4% |
| 94 | 1062 | 2.3% |
| 91 | 1057 | 2.3% |
| 88 | 1032 | 2.3% |
| Other values (343) | 31658 |
| Value | Count | Frequency (%) |
| 0 | 1558 | |
| 1 | 107 | 0.2% |
| 2 | 33 | 0.1% |
| 3 | 48 | 0.1% |
| 4 | 51 | 0.1% |
| Value | Count | Frequency (%) |
| 1256 | 1 | |
| 1140 | 2 | |
| 931 | 1 | |
| 925 | 1 | |
| 900 | 1 |
spoken_languages
Text
| Distinct | 1931 |
|---|---|
| Distinct (%) | 4.2% |
| Missing | 6 |
| Missing (%) | < 0.1% |
| Memory size | 5.3 MiB |
Length
| Max length | 765 |
|---|---|
| Median length | 40 |
| Mean length | 46.92828861 |
| Min length | 2 |
Characters and Unicode
| Total characters | 2133360 |
|---|---|
| Distinct characters | 184 |
| Distinct categories | 11 |
| Distinct scripts | 15 |
| Distinct blocks | 16 |
Unique
| Unique | 1366 |
|---|---|
| Unique (%) | 3.0% |
Sample
| 1st row | [{'iso_639_1': 'en', 'name': 'English'}] |
|---|---|
| 2nd row | [{'iso_639_1': 'en', 'name': 'English'}, {'iso_639_1': 'fr', 'name': 'Français'}] |
| 3rd row | [{'iso_639_1': 'en', 'name': 'English'}] |
| 4th row | [{'iso_639_1': 'en', 'name': 'English'}] |
| 5th row | [{'iso_639_1': 'en', 'name': 'English'}] |
| Value | Count | Frequency (%) |
| iso_639_1 | 53300 | |
| name | 53300 | |
| english | 28745 | |
| en | 28745 | |
| (blank) | 4809 | 2.2% |
| fr | 4196 | 1.9% |
| français | 4196 | 1.9% |
| deutsch | 2625 | 1.2% |
| de | 2625 | 1.2% |
| es | 2413 | 1.1% |
| Other values (203) | 33488 |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 426400 | |
| (space) | 172982 | 8.1% |
| n | 120605 | 5.7% |
| _ | 106600 | 5.0% |
| : | 106600 | 5.0% |
| s | 99222 | 4.7% |
| i | 94120 | 4.4% |
| e | 92748 | 4.3% |
| a | 75235 | 3.5% |
| , | 64969 | 3.0% |
| Other values (174) | 773879 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 771936 | |
| Other Punctuation | 599060 | |
| Decimal Number | 213226 | 10.0% |
| Space Separator | 172982 | 8.1% |
| Connector Punctuation | 106600 | 5.0% |
| Close Punctuation | 98760 | 4.6% |
| Open Punctuation | 98760 | 4.6% |
| Uppercase Letter | 46453 | 2.2% |
| Other Letter | 22196 | 1.0% |
| Spacing Mark | 1838 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 120605 | |
| s | 99222 | |
| i | 94120 | |
| e | 92748 | |
| a | 75235 | |
| o | 61255 | |
| m | 54012 | |
| l | 36054 | 4.7% |
| h | 33831 | 4.4% |
| g | 30529 | 4.0% |
| Other values (65) | 74325 |
Other Letter
| Value | Count | Frequency (%) |
| 語 | 1758 | 7.9% |
| 本 | 1758 | 7.9% |
| 日 | 1758 | 7.9% |
| 话 | 1263 | 5.7% |
| 州 | 946 | 4.3% |
| 普 | 790 | 3.6% |
| 通 | 790 | 3.6% |
| द | 707 | 3.2% |
| ह | 707 | 3.2% |
| न | 707 | 3.2% |
| Other values (46) | 11012 |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 31215 | |
| F | 4198 | 9.0% |
| D | 2927 | 6.3% |
| P | 2678 | 5.8% |
| I | 2367 | 5.1% |
| N | 830 | 1.8% |
| L | 506 | 1.1% |
| M | 363 | 0.8% |
| T | 308 | 0.7% |
| Č | 284 | 0.6% |
| Other values (13) | 777 | 1.7% |
Spacing Mark
| Value | Count | Frequency (%) |
| ी | 707 | |
| ि | 707 | |
| ు | 136 | 7.4% |
| ி | 111 | 6.0% |
| া | 94 | 5.1% |
| ং | 47 | 2.6% |
| ਾ | 18 | 1.0% |
| ੀ | 18 | 1.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 426400 | |
| : | 106600 | 17.8% |
| , | 64969 | 10.8% |
| / | 1015 | 0.2% |
| ? | 50 | < 0.1% |
| \ | 26 | < 0.1% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ् | 707 | |
| ִ | 430 | |
| ְ | 215 | 13.9% |
| ் | 111 | 7.2% |
| ె | 68 | 4.4% |
| ੰ | 18 | 1.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 53326 | |
| 3 | 53300 | |
| 6 | 53300 | |
| 1 | 53300 |
Close Punctuation
| Value | Count | Frequency (%) |
| } | 53300 | |
| ] | 45460 |
Open Punctuation
| Value | Count | Frequency (%) |
| { | 53300 | |
| [ | 45460 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 172982 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 106600 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1289388 | |
| Latin | 805994 | |
| Han | 10482 | 0.5% |
| Cyrillic | 10460 | 0.5% |
| Devanagari | 4242 | 0.2% |
| Arabic | 3349 | 0.2% |
| Hangul | 3252 | 0.2% |
| Hebrew | 1720 | 0.1% |
| Greek | 1704 | 0.1% |
| Thai | 1232 | 0.1% |
| Other values (5) | 1537 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 120605 | |
| s | 99222 | |
| i | 94120 | |
| e | 92748 | |
| a | 75235 | |
| o | 61255 | |
| m | 54012 | |
| l | 36054 | 4.5% |
| h | 33831 | 4.2% |
| E | 31215 | 3.9% |
| Other values (52) | 107697 |
Cyrillic
| Value | Count | Frequency (%) |
| с | 3213 | |
| к | 1735 | |
| и | 1680 | |
| й | 1616 | |
| у | 1565 | |
| а | 113 | 1.1% |
| р | 87 | 0.8% |
| У | 53 | 0.5% |
| ї | 53 | 0.5% |
| н | 53 | 0.5% |
| Other values (12) | 292 | 2.8% |
Common
| Value | Count | Frequency (%) |
| ' | 426400 | |
| (space) | 172982 | |
| _ | 106600 | 8.3% |
| : | 106600 | 8.3% |
| , | 64969 | 5.0% |
| 9 | 53326 | 4.1% |
| } | 53300 | 4.1% |
| { | 53300 | 4.1% |
| 3 | 53300 | 4.1% |
| 6 | 53300 | 4.1% |
| Other values (6) | 145311 | 11.3% |
Arabic
| Value | Count | Frequency (%) |
| ا | 538 | |
| ر | 538 | |
| ي | 341 | |
| ب | 341 | |
| ع | 341 | |
| ل | 341 | |
| ة | 341 | |
| ف | 142 | 4.2% |
| س | 142 | 4.2% |
| ی | 142 | 4.2% |
| Other values (5) | 142 | 4.2% |
Han
| Value | Count | Frequency (%) |
| 語 | 1758 | |
| 本 | 1758 | |
| 日 | 1758 | |
| 话 | 1263 | |
| 州 | 946 | |
| 普 | 790 | |
| 通 | 790 | |
| 話 | 473 | 4.5% |
| 广 | 473 | 4.5% |
| 廣 | 473 | 4.5% |
Hebrew
| Value | Count | Frequency (%) |
| ִ | 430 | |
| ת | 215 | |
| ע | 215 | |
| ר | 215 | |
| י | 215 | |
| ְ | 215 | |
| ב | 215 |
Greek
| Value | Count | Frequency (%) |
| λ | 426 | |
| ν | 213 | |
| ά | 213 | |
| κ | 213 | |
| η | 213 | |
| ε | 213 | |
| ι | 213 |
Georgian
| Value | Count | Frequency (%) |
| ი | 33 | |
| უ | 33 | |
| თ | 33 | |
| რ | 33 | |
| ა | 33 | |
| ქ | 33 | |
| ლ | 33 |
Devanagari
| Value | Count | Frequency (%) |
| ी | 707 | |
| द | 707 | |
| ् | 707 | |
| ह | 707 | |
| न | 707 | |
| ि | 707 |
Hangul
| Value | Count | Frequency (%) |
| 선 | 542 | |
| 말 | 542 | |
| 조 | 542 | |
| 어 | 542 | |
| 한 | 542 | |
| 국 | 542 |
Thai
| Value | Count | Frequency (%) |
| า | 352 | |
| ภ | 176 | |
| ไ | 176 | |
| ท | 176 | |
| ย | 176 | |
| ษ | 176 |
Gurmukhi
| Value | Count | Frequency (%) |
| ਾ | 18 | |
| ਬ | 18 | |
| ੀ | 18 | |
| ਜ | 18 | |
| ੰ | 18 | |
| ਪ | 18 |
Telugu
| Value | Count | Frequency (%) |
| ు | 136 | |
| ె | 68 | |
| ల | 68 | |
| గ | 68 | |
| త | 68 |
Tamil
| Value | Count | Frequency (%) |
| ம | 111 | |
| ் | 111 | |
| ழ | 111 | |
| ி | 111 | |
| த | 111 |
Bengali
| Value | Count | Frequency (%) |
| া | 94 | |
| ং | 47 | |
| ব | 47 | |
| ল | 47 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2086548 | |
| CJK | 10482 | 0.5% |
| Cyrillic | 10460 | 0.5% |
| None | 10412 | 0.5% |
| Devanagari | 4242 | 0.2% |
| Arabic | 3349 | 0.2% |
| Hangul | 3252 | 0.2% |
| Hebrew | 1720 | 0.1% |
| Thai | 1232 | 0.1% |
| Tamil | 555 | < 0.1% |
| Other values (6) | 1108 | 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| ' | 426400 | |
| (space) | 172982 | 8.3% |
| n | 120605 | 5.8% |
| _ | 106600 | 5.1% |
| : | 106600 | 5.1% |
| s | 99222 | 4.8% |
| i | 94120 | 4.5% |
| e | 92748 | 4.4% |
| a | 75235 | 3.6% |
| , | 64969 | 3.1% |
| Other values (52) | 727067 |
None
| Value | Count | Frequency (%) |
| ç | 4443 | |
| ñ | 2413 | |
| ê | 591 | 5.7% |
| λ | 426 | 4.1% |
| ý | 284 | 2.7% |
| Č | 284 | 2.7% |
| ü | 247 | 2.4% |
| ν | 213 | 2.0% |
| ά | 213 | 2.0% |
| κ | 213 | 2.0% |
| Other values (10) | 1085 | 10.4% |
Cyrillic
| Value | Count | Frequency (%) |
| с | 3213 | |
| к | 1735 | |
| и | 1680 | |
| й | 1616 | |
| у | 1565 | |
| а | 113 | 1.1% |
| р | 87 | 0.8% |
| У | 53 | 0.5% |
| ї | 53 | 0.5% |
| н | 53 | 0.5% |
| Other values (12) | 292 | 2.8% |
CJK
| Value | Count | Frequency (%) |
| 語 | 1758 | |
| 本 | 1758 | |
| 日 | 1758 | |
| 话 | 1263 | |
| 州 | 946 | |
| 普 | 790 | |
| 通 | 790 | |
| 話 | 473 | 4.5% |
| 广 | 473 | 4.5% |
| 廣 | 473 | 4.5% |
Devanagari
| Value | Count | Frequency (%) |
| ी | 707 | |
| द | 707 | |
| ् | 707 | |
| ह | 707 | |
| न | 707 | |
| ि | 707 |
Hangul
| Value | Count | Frequency (%) |
| 선 | 542 | |
| 말 | 542 | |
| 조 | 542 | |
| 어 | 542 | |
| 한 | 542 | |
| 국 | 542 |
Arabic
| Value | Count | Frequency (%) |
| ا | 538 | |
| ر | 538 | |
| ي | 341 | |
| ب | 341 | |
| ع | 341 | |
| ل | 341 | |
| ة | 341 | |
| ف | 142 | 4.2% |
| س | 142 | 4.2% |
| ی | 142 | 4.2% |
| Other values (5) | 142 | 4.2% |
Hebrew
| Value | Count | Frequency (%) |
| ִ | 430 | |
| ת | 215 | |
| ע | 215 | |
| ר | 215 | |
| י | 215 | |
| ְ | 215 | |
| ב | 215 |
Thai
| Value | Count | Frequency (%) |
| า | 352 | |
| ภ | 176 | |
| ไ | 176 | |
| ท | 176 | |
| ย | 176 | |
| ษ | 176 |
Telugu
| Value | Count | Frequency (%) |
| ు | 136 | |
| ె | 68 | |
| ల | 68 | |
| గ | 68 | |
| త | 68 |
Tamil
| Value | Count | Frequency (%) |
| ம | 111 | |
| ் | 111 | |
| ழ | 111 | |
| ி | 111 | |
| த | 111 |
Bengali
| Value | Count | Frequency (%) |
| া | 94 | |
| ং | 47 | |
| ব | 47 | |
| ল | 47 |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ế | 61 | |
| ệ | 61 |
Georgian
| Value | Count | Frequency (%) |
| ი | 33 | |
| უ | 33 | |
| თ | 33 | |
| რ | 33 | |
| ა | 33 | |
| ქ | 33 | |
| ლ | 33 |
Gurmukhi
| Value | Count | Frequency (%) |
| ਾ | 18 | |
| ਬ | 18 | |
| ੀ | 18 | |
| ਜ | 18 | |
| ੰ | 18 | |
| ਪ | 18 |
IPA Ext
| Value | Count | Frequency (%) |
| ə | 4 |
status
Text
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 87 |
| Missing (%) | 0.2% |
| Memory size | 2.8 MiB |
Length
| Max length | 15 |
|---|---|
| Median length | 8 |
| Mean length | 8.011921814 |
| Min length | 7 |
Characters and Unicode
| Total characters | 363573 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Released |
|---|---|
| 2nd row | Released |
| 3rd row | Released |
| 4th row | Released |
| 5th row | Released |
| Value | Count | Frequency (%) |
| released | 45014 | |
| rumored | 230 | 0.5% |
| production | 118 | 0.3% |
| post | 98 | 0.2% |
| in | 20 | < 0.1% |
| planned | 15 | < 0.1% |
| canceled | 2 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 135291 | |
| d | 45379 | 12.5% |
| R | 45244 | 12.4% |
| s | 45112 | 12.4% |
| l | 45031 | 12.4% |
| a | 45031 | 12.4% |
| o | 564 | 0.2% |
| r | 348 | 0.1% |
| u | 348 | 0.1% |
| P | 231 | 0.1% |
| Other values (8) | 994 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 317958 | |
| Uppercase Letter | 45497 | 12.5% |
| Space Separator | 118 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 135291 | |
| d | 45379 | 14.3% |
| s | 45112 | 14.2% |
| l | 45031 | 14.2% |
| a | 45031 | 14.2% |
| o | 564 | 0.2% |
| r | 348 | 0.1% |
| u | 348 | 0.1% |
| m | 230 | 0.1% |
| t | 216 | 0.1% |
| Other values (3) | 408 | 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 45244 | |
| P | 231 | 0.5% |
| I | 20 | < 0.1% |
| C | 2 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 118 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 363455 | |
| Common | 118 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 135291 | |
| d | 45379 | 12.5% |
| R | 45244 | 12.4% |
| s | 45112 | 12.4% |
| l | 45031 | 12.4% |
| a | 45031 | 12.4% |
| o | 564 | 0.2% |
| r | 348 | 0.1% |
| u | 348 | 0.1% |
| P | 231 | 0.1% |
| Other values (7) | 876 | 0.2% |
Common
| Value | Count | Frequency (%) |
| (space) | 118 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 363573 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 135291 | |
| d | 45379 | 12.5% |
| R | 45244 | 12.4% |
| s | 45112 | 12.4% |
| l | 45031 | 12.4% |
| a | 45031 | 12.4% |
| o | 564 | 0.2% |
| r | 348 | 0.1% |
| u | 348 | 0.1% |
| P | 231 | 0.1% |
| Other values (8) | 994 | 0.3% |
tagline
Text
MISSING 
| Distinct | 20283 |
|---|---|
| Distinct (%) | 99.4% |
| Missing | 25054 |
| Missing (%) | 55.1% |
| Memory size | 2.8 MiB |
Length
| Max length | 297 |
|---|---|
| Median length | 204 |
| Mean length | 47.00284147 |
| Min length | 1 |
Characters and Unicode
| Total characters | 959422 |
|---|---|
| Distinct characters | 170 |
| Distinct categories | 17 |
| Distinct scripts | 6 |
| Distinct blocks | 10 |
Unique
| Unique | 20177 |
|---|---|
| Unique (%) | 98.8% |
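The percentages reported for tagline use different denominators: Missing (%) is relative to all 45466 observations, while Distinct (%) and Unique (%) appear to be relative to the non-missing values only. This is an inference from the printed figures, not a documented rule; the check below reproduces the three percentages under that assumption.

```python
n_rows = 45466     # Number of observations, from the dataset statistics
missing = 25054    # Missing
distinct = 20283   # Distinct
unique = 20177     # Unique

non_missing = n_rows - missing  # values actually present in the column

missing_pct = round(100 * missing / n_rows, 1)
distinct_pct = round(100 * distinct / non_missing, 1)
unique_pct = round(100 * unique / non_missing, 1)

print(missing_pct, distinct_pct, unique_pct)  # 55.1 99.4 98.8
```

Reading Distinct (%) this way matters for sparse columns: almost every tagline that exists is distinct, even though more than half of the rows have none.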
Sample
| 1st row | Roll the dice and unleash the excitement! |
|---|---|
| 2nd row | Still Yelling. Still Fighting. Still Ready for Love. |
| 3rd row | Friends are the people who let you be yourself... and never let you forget it. |
| 4th row | Just When His World Is Back To Normal... He's In For The Surprise Of His Life! |
| 5th row | A Los Angeles Crime Saga |
| Value | Count | Frequency (%) |
| the | 11004 | 6.3% |
| a | 6820 | 3.9% |
| of | 4406 | 2.5% |
| to | 3586 | 2.1% |
| is | 2800 | 1.6% |
| in | 2693 | 1.5% |
| and | 2686 | 1.5% |
| you | 2389 | 1.4% |
| (blank) | 1585 | 0.9% |
| for | 1524 | 0.9% |
| Other values (15108) | 134566 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 153795 | |
| e | 94486 | 9.8% |
| t | 57309 | 6.0% |
| o | 56611 | 5.9% |
| a | 51521 | 5.4% |
| n | 47539 | 5.0% |
| i | 46086 | 4.8% |
| r | 45029 | 4.7% |
| s | 42399 | 4.4% |
| h | 37192 | 3.9% |
| Other values (160) | 327455 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 681040 | |
| Space Separator | 153795 | 16.0% |
| Uppercase Letter | 75028 | 7.8% |
| Other Punctuation | 44604 | 4.6% |
| Decimal Number | 2687 | 0.3% |
| Dash Punctuation | 1948 | 0.2% |
| Final Punctuation | 98 | < 0.1% |
| Open Punctuation | 56 | < 0.1% |
| Close Punctuation | 55 | < 0.1% |
| Currency Symbol | 37 | < 0.1% |
| Other values (7) | 74 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 94486 | |
| t | 57309 | 8.4% |
| o | 56611 | 8.3% |
| a | 51521 | 7.6% |
| n | 47539 | 7.0% |
| i | 46086 | 6.8% |
| r | 45029 | 6.6% |
| s | 42399 | 6.2% |
| h | 37192 | 5.5% |
| l | 30199 | 4.4% |
| Other values (43) | 172669 |
Other Letter
| Value | Count | Frequency (%) |
| 劇 | 1 | 2.9% |
| ஆ | 1 | 2.9% |
| த | 1 | 2.9% |
| வ | 1 | 2.9% |
| 時 | 1 | 2.9% |
| ன | 1 | 2.9% |
| 熟 | 1 | 2.9% |
| 場 | 1 | 2.9% |
| 版 | 1 | 2.9% |
| ク | 1 | 2.9% |
| Other values (24) | 24 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 10013 | 13.3% |
| A | 6878 | 9.2% |
| S | 5653 | 7.5% |
| H | 4404 | 5.9% |
| I | 4387 | 5.8% |
| E | 4307 | 5.7% |
| W | 3683 | 4.9% |
| O | 3479 | 4.6% |
| L | 3196 | 4.3% |
| N | 3196 | 4.3% |
| Other values (20) | 25832 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 26655 | |
| ! | 5785 | 13.0% |
| ' | 5676 | 12.7% |
| , | 4231 | 9.5% |
| ? | 1161 | 2.6% |
| " | 582 | 1.3% |
| … | 148 | 0.3% |
| : | 138 | 0.3% |
| & | 84 | 0.2% |
| * | 42 | 0.1% |
| Other values (7) | 102 | 0.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 802 | |
| 1 | 516 | |
| 2 | 299 | 11.1% |
| 3 | 208 | 7.7% |
| 9 | 208 | 7.7% |
| 5 | 168 | 6.3% |
| 4 | 140 | 5.2% |
| 7 | 121 | 4.5% |
| 6 | 121 | 4.5% |
| 8 | 104 | 3.9% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 5 | |
| + | 5 | |
| | | 2 | 14.3% |
| ~ | 1 | 7.1% |
| − | 1 | 7.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1931 | |
| – | 9 | 0.5% |
| — | 8 | 0.4% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 82 | |
| ” | 15 | 15.3% |
| » | 1 | 1.0% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 14 | |
| ‘ | 4 | 21.1% |
| « | 1 | 5.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 49 | |
| [ | 7 | 12.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 48 | |
| ] | 7 | 12.7% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 2 | |
| ² | 1 |
Modifier Letter
| Value | Count | Frequency (%) |
| ˌ | 1 | |
| ˈ | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 153795 | |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 37 |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ் | 1 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 756068 | |
| Common | 203319 | 21.2% |
| Han | 21 | < 0.1% |
| Tamil | 5 | < 0.1% |
| Hiragana | 5 | < 0.1% |
| Katakana | 4 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 94486 | 12.5% |
| t | 57309 | 7.6% |
| o | 56611 | 7.5% |
| a | 51521 | 6.8% |
| n | 47539 | 6.3% |
| i | 46086 | 6.1% |
| r | 45029 | 6.0% |
| s | 42399 | 5.6% |
| h | 37192 | 4.9% |
| l | 30199 | 4.0% |
| Other values (73) | 247697 |
Common
| Value | Count | Frequency (%) |
| (space) | 153795 | |
| . | 26655 | 13.1% |
| ! | 5785 | 2.8% |
| ' | 5676 | 2.8% |
| , | 4231 | 2.1% |
| - | 1931 | 0.9% |
| ? | 1161 | 0.6% |
| 0 | 802 | 0.4% |
| " | 582 | 0.3% |
| 1 | 516 | 0.3% |
| Other values (42) | 2185 | 1.1% |
Han
| Value | Count | Frequency (%) |
| 劇 | 1 | 4.8% |
| 時 | 1 | 4.8% |
| 熟 | 1 | 4.8% |
| 場 | 1 | 4.8% |
| 版 | 1 | 4.8% |
| 桃 | 1 | 4.8% |
| 最 | 1 | 4.8% |
| 后 | 1 | 4.8% |
| 的 | 1 | 4.8% |
| 舞 | 1 | 4.8% |
| Other values (11) | 11 |
Tamil
| Value | Count | Frequency (%) |
| ஆ | 1 | |
| த | 1 | |
| வ | 1 | |
| ன | 1 | |
| ் | 1 |
Hiragana
| Value | Count | Frequency (%) |
| る | 1 | |
| は | 1 | |
| し | 1 | |
| て | 1 | |
| い | 1 |
Katakana
| Value | Count | Frequency (%) |
| ク | 1 | |
| ラ | 1 | |
| ナ | 1 | |
| ド | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 958992 | |
| Punctuation | 280 | < 0.1% |
| None | 110 | < 0.1% |
| CJK | 21 | < 0.1% |
| Tamil | 5 | < 0.1% |
| Hiragana | 5 | < 0.1% |
| Katakana | 4 | < 0.1% |
| IPA Ext | 2 | < 0.1% |
| Modifier Letters | 2 | < 0.1% |
| Math Operators | 1 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 153795 | |
| e | 94486 | 9.9% |
| t | 57309 | 6.0% |
| o | 56611 | 5.9% |
| a | 51521 | 5.4% |
| n | 47539 | 5.0% |
| i | 46086 | 4.8% |
| r | 45029 | 4.7% |
| s | 42399 | 4.4% |
| h | 37192 | 3.9% |
| Other values (78) | 327025 |
Punctuation
| Value | Count | Frequency (%) |
| … | 148 | |
| ’ | 82 | |
| ” | 15 | 5.4% |
| “ | 14 | 5.0% |
| – | 9 | 3.2% |
| — | 8 | 2.9% |
| ‘ | 4 | 1.4% |
None
| Value | Count | Frequency (%) |
| é | 18 | |
| ä | 16 | |
| ö | 8 | 7.3% |
| á | 6 | 5.5% |
| ó | 6 | 5.5% |
| í | 5 | 4.5% |
| ü | 5 | 4.5% |
| ı | 5 | 4.5% |
| · | 4 | 3.6% |
| ñ | 3 | 2.7% |
| Other values (26) | 34 |
IPA Ext
| Value | Count | Frequency (%) |
| ə | 2 |
CJK
| Value | Count | Frequency (%) |
| 劇 | 1 | 4.8% |
| 時 | 1 | 4.8% |
| 熟 | 1 | 4.8% |
| 場 | 1 | 4.8% |
| 版 | 1 | 4.8% |
| 桃 | 1 | 4.8% |
| 最 | 1 | 4.8% |
| 后 | 1 | 4.8% |
| 的 | 1 | 4.8% |
| 舞 | 1 | 4.8% |
| Other values (11) | 11 |
Tamil
| Value | Count | Frequency (%) |
| ஆ | 1 | |
| த | 1 | |
| வ | 1 | |
| ன | 1 | |
| ் | 1 |
Modifier Letters
| Value | Count | Frequency (%) |
| ˌ | 1 | |
| ˈ | 1 |
Katakana
| Value | Count | Frequency (%) |
| ク | 1 | |
| ラ | 1 | |
| ナ | 1 | |
| ド | 1 |
Hiragana
| Value | Count | Frequency (%) |
| る | 1 | |
| は | 1 | |
| し | 1 | |
| て | 1 | |
| い | 1 |
Math Operators
| Value | Count | Frequency (%) |
| − | 1 |
title
Text
| Distinct | 42277 |
|---|---|
| Distinct (%) | 93.0% |
| Missing | 6 |
| Missing (%) | < 0.1% |
| Memory size | 3.2 MiB |
Length
| Max length | 105 |
|---|---|
| Median length | 79 |
| Mean length | 16.70853498 |
| Min length | 1 |
Characters and Unicode
| Total characters | 759570 |
|---|---|
| Distinct characters | 287 |
| Distinct categories | 17 |
| Distinct scripts | 7 |
| Distinct blocks | 12 |
Unique
| Unique | 39947 |
|---|---|
| Unique (%) | 87.9% |
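The difference between the "Distinct" and "Unique" figures above can be illustrated with a minimal sketch. It assumes that "distinct" means the number of different non-missing values and "unique" means the number of values that occur exactly once; the helper name `distinct_and_unique` and the sample list are hypothetical and not taken from the report.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) for a column, ignoring missing (None) entries."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)                              # different values seen
    unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
    return distinct, unique

# Hypothetical sample: one title is repeated, one entry is missing.
titles = ["Toy Story", "Jumanji", "Hamlet", "Hamlet", None]
distinct, unique = distinct_and_unique(titles)
```

Under these assumptions, a repeated title counts toward "distinct" once and toward "unique" not at all, which is why the Unique figure is the smaller of the two.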
Sample
| 1st row | Toy Story |
|---|---|
| 2nd row | Jumanji |
| 3rd row | Grumpier Old Men |
| 4th row | Waiting to Exhale |
| 5th row | Father of the Bride Part II |
| Value | Count | Frequency (%) |
| the | 14571 | 10.7% |
| of | 4938 | 3.6% |
| a | 2244 | 1.6% |
| in | 1697 | 1.2% |
| and | 1634 | 1.2% |
| to | 1055 | 0.8% |
| | 763 | 0.6% |
| man | 665 | 0.5% |
| love | 664 | 0.5% |
| for | 602 | 0.4% |
| Other values (24431) | 107634 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 91029 | 12.0% |
| e | 76408 | 10.1% |
| a | 49056 | 6.5% |
| o | 45765 | 6.0% |
| n | 40931 | 5.4% |
| r | 40096 | 5.3% |
| i | 39859 | 5.2% |
| t | 36792 | 4.8% |
| s | 29591 | 3.9% |
| h | 28564 | 3.8% |
| Other values (277) | 281479 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 535372 | |
| Uppercase Letter | 117493 | 15.5% |
| Space Separator | 91029 | 12.0% |
| Other Punctuation | 10513 | 1.4% |
| Decimal Number | 3863 | 0.5% |
| Dash Punctuation | 986 | 0.1% |
| Close Punctuation | 87 | < 0.1% |
| Open Punctuation | 85 | < 0.1% |
| Final Punctuation | 38 | < 0.1% |
| Other Letter | 25 | < 0.1% |
| Other values (7) | 79 | < 0.1% |
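A per-category breakdown like the table above can be approximated with Python's standard `unicodedata` module. This is only a sketch: the helper name `category_counts` and the sample values are hypothetical, and `unicodedata.category` returns two-letter codes (for example `Ll`, `Lu`, `Zs`) rather than the long names shown in the report.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category across all non-missing values."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing entries
            continue
        counts.update(unicodedata.category(ch) for ch in str(value))
    return counts

# Hypothetical sample taken from the first rows of the title column.
sample_titles = ["Toy Story", "Jumanji", None]
counts = category_counts(sample_titles)
```

Here `Ll` corresponds to Lowercase Letter, `Lu` to Uppercase Letter, and `Zs` to Space Separator.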
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 76408 | |
| a | 49056 | |
| o | 45765 | 8.5% |
| n | 40931 | 7.6% |
| r | 40096 | 7.5% |
| i | 39859 | 7.4% |
| t | 36792 | 6.9% |
| s | 29591 | 5.5% |
| h | 28564 | 5.3% |
| l | 25992 | 4.9% |
| Other values (121) | 122318 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 16037 | |
| S | 10354 | 8.8% |
| M | 8042 | 6.8% |
| B | 7674 | 6.5% |
| C | 7175 | 6.1% |
| A | 6808 | 5.8% |
| D | 6355 | 5.4% |
| L | 5883 | 5.0% |
| H | 5183 | 4.4% |
| W | 5175 | 4.4% |
| Other values (65) | 38807 |
Other Letter
| Value | Count | Frequency (%) |
| ی | 2 | 8.0% |
| ک | 2 | 8.0% |
| چ | 2 | 8.0% |
| ه | 2 | 8.0% |
| 時 | 1 | 4.0% |
| 狗 | 1 | 4.0% |
| 空 | 1 | 4.0% |
| 傳 | 1 | 4.0% |
| ª | 1 | 4.0% |
| ا | 1 | 4.0% |
| Other values (11) | 11 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 3727 | |
| ' | 2512 | |
| . | 1604 | |
| , | 1136 | 10.8% |
| ! | 648 | 6.2% |
| & | 460 | 4.4% |
| ? | 269 | 2.6% |
| / | 80 | 0.8% |
| * | 19 | 0.2% |
| # | 13 | 0.1% |
| Other values (8) | 45 | 0.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 864 | |
| 1 | 699 | |
| 0 | 619 | |
| 3 | 484 | |
| 9 | 230 | 6.0% |
| 4 | 229 | 5.9% |
| 5 | 225 | 5.8% |
| 7 | 196 | 5.1% |
| 8 | 161 | 4.2% |
| 6 | 156 | 4.0% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 17 | |
| × | 3 | 12.5% |
| = | 1 | 4.2% |
| ∞ | 1 | 4.2% |
| − | 1 | 4.2% |
| → | 1 | 4.2% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 12 | |
| ² | 3 | 15.8% |
| ³ | 2 | 10.5% |
| ⅓ | 1 | 5.3% |
| ⁴ | 1 | 5.3% |
Other Symbol
| Value | Count | Frequency (%) |
| ° | 3 | |
| ☆ | 2 | |
| ™ | 1 | 12.5% |
| ♡ | 1 | 12.5% |
| № | 1 | 12.5% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 18 | |
| ¢ | 2 | 9.5% |
| £ | 1 | 4.8% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 971 | |
| – | 15 | 1.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 82 | |
| ] | 5 | 5.7% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 80 | |
| [ | 5 | 5.9% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 37 | |
| ” | 1 | 2.6% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 1 | |
| ‘ | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 91029 | |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 3 |
Format
| Value | Count | Frequency (%) |
| | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 652335 | |
| Common | 106680 | 14.0% |
| Cyrillic | 361 | < 0.1% |
| Greek | 170 | < 0.1% |
| Arabic | 11 | < 0.1% |
| Katakana | 8 | < 0.1% |
| Han | 5 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 76408 | 11.7% |
| a | 49056 | 7.5% |
| o | 45765 | 7.0% |
| n | 40931 | 6.3% |
| r | 40096 | 6.1% |
| i | 39859 | 6.1% |
| t | 36792 | 5.6% |
| s | 29591 | 4.5% |
| h | 28564 | 4.4% |
| l | 25992 | 4.0% |
| Other values (107) | 239281 |
Common
| Value | Count | Frequency (%) |
| (space) | 91029 | |
| : | 3727 | 3.5% |
| ' | 2512 | 2.4% |
| . | 1604 | 1.5% |
| , | 1136 | 1.1% |
| - | 971 | 0.9% |
| 2 | 864 | 0.8% |
| 1 | 699 | 0.7% |
| ! | 648 | 0.6% |
| 0 | 619 | 0.6% |
| Other values (50) | 2871 | 2.7% |
Cyrillic
| Value | Count | Frequency (%) |
| е | 33 | 9.1% |
| о | 32 | 8.9% |
| а | 32 | 8.9% |
| н | 26 | 7.2% |
| и | 24 | 6.6% |
| р | 23 | 6.4% |
| к | 17 | 4.7% |
| в | 16 | 4.4% |
| с | 15 | 4.2% |
| л | 14 | 3.9% |
| Other values (38) | 129 |
Greek
| Value | Count | Frequency (%) |
| α | 20 | 11.8% |
| ι | 14 | 8.2% |
| ο | 14 | 8.2% |
| τ | 9 | 5.3% |
| λ | 8 | 4.7% |
| ρ | 8 | 4.7% |
| ά | 8 | 4.7% |
| ν | 7 | 4.1% |
| ε | 6 | 3.5% |
| π | 6 | 3.5% |
| Other values (32) | 70 |
Katakana
| Value | Count | Frequency (%) |
| タ | 1 | |
| ン | 1 | |
| ポ | 1 | |
| ィ | 1 | |
| テ | 1 | |
| ス | 1 | |
| ァ | 1 | |
| フ | 1 |
Arabic
| Value | Count | Frequency (%) |
| ی | 2 | |
| ک | 2 | |
| چ | 2 | |
| ه | 2 | |
| ا | 1 | |
| س | 1 | |
| ج | 1 |
Han
| Value | Count | Frequency (%) |
| 時 | 1 | |
| 狗 | 1 | |
| 空 | 1 | |
| 傳 | 1 | |
| 貓 | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 757982 | |
| None | 1132 | 0.1% |
| Cyrillic | 361 | < 0.1% |
| Punctuation | 62 | < 0.1% |
| Arabic | 11 | < 0.1% |
| Katakana | 8 | < 0.1% |
| CJK | 5 | < 0.1% |
| Misc Symbols | 3 | < 0.1% |
| Letterlike Symbols | 2 | < 0.1% |
| Math Operators | 2 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 91029 | 12.0% |
| e | 76408 | 10.1% |
| a | 49056 | 6.5% |
| o | 45765 | 6.0% |
| n | 40931 | 5.4% |
| r | 40096 | 5.3% |
| i | 39859 | 5.3% |
| t | 36792 | 4.9% |
| s | 29591 | 3.9% |
| h | 28564 | 3.8% |
| Other values (76) | 279891 |
None
| Value | Count | Frequency (%) |
| é | 218 | |
| ä | 128 | 11.3% |
| ö | 56 | 4.9% |
| è | 54 | 4.8% |
| ô | 44 | 3.9% |
| ü | 39 | 3.4% |
| ó | 37 | 3.3% |
| á | 35 | 3.1% |
| ı | 35 | 3.1% |
| à | 33 | 2.9% |
| Other values (108) | 453 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 37 | |
| – | 15 | |
| … | 5 | 8.1% |
| | 2 | 3.2% |
| “ | 1 | 1.6% |
| ‘ | 1 | 1.6% |
| ” | 1 | 1.6% |
Cyrillic
| Value | Count | Frequency (%) |
| е | 33 | 9.1% |
| о | 32 | 8.9% |
| а | 32 | 8.9% |
| н | 26 | 7.2% |
| и | 24 | 6.6% |
| р | 23 | 6.4% |
| к | 17 | 4.7% |
| в | 16 | 4.4% |
| с | 15 | 4.2% |
| л | 14 | 3.9% |
| Other values (38) | 129 |
Arabic
| Value | Count | Frequency (%) |
| ی | 2 | |
| ک | 2 | |
| چ | 2 | |
| ه | 2 | |
| ا | 1 | |
| س | 1 | |
| ج | 1 |
Misc Symbols
| Value | Count | Frequency (%) |
| ☆ | 2 | |
| ♡ | 1 |
CJK
| Value | Count | Frequency (%) |
| 時 | 1 | |
| 狗 | 1 | |
| 空 | 1 | |
| 傳 | 1 | |
| 貓 | 1 |
Number Forms
| Value | Count | Frequency (%) |
| ⅓ | 1 |
Letterlike Symbols
| Value | Count | Frequency (%) |
| ™ | 1 | |
| № | 1 |
Katakana
| Value | Count | Frequency (%) |
| タ | 1 | |
| ン | 1 | |
| ポ | 1 | |
| ィ | 1 | |
| テ | 1 | |
| ス | 1 | |
| ァ | 1 | |
| フ | 1 |
Math Operators
| Value | Count | Frequency (%) |
| ∞ | 1 | |
| − | 1 |
Arrows
| Value | Count | Frequency (%) |
| → | 1 |
video
Boolean
IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 6 |
| Missing (%) | < 0.1% |
| Memory size | 1.6 MiB |
| Value | Count | Frequency (%) |
| False | 45367 | |
| True | 93 | 0.2% |
| (Missing) | 6 | < 0.1% |
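The imbalance alert for `video` can be related to the counts above with a short sketch. It assumes the score is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy for that number of classes, with missing values excluded. Under that assumption the two counts give roughly 97.9%, which matches the alert in the report overview. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Counts of False and True from the table above (the 6 missing rows are excluded).
score = imbalance_score([45367, 93])
```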
vote_average
Real number (ℝ)
ZEROS 
| Distinct | 92 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 6 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.618207215 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 2998 |
| Zeros (%) | 6.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 355.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 5 |
| median | 6 |
| Q3 | 6.8 |
| 95-th percentile | 7.8 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 1.8 |
Descriptive statistics
| Standard deviation | 1.924215992 |
|---|---|
| Coefficient of variation (CV) | 0.3424964438 |
| Kurtosis | 2.500402198 |
| Mean | 5.618207215 |
| Median Absolute Deviation (MAD) | 0.9 |
| Skewness | -1.518990058 |
| Sum | 255403.7 |
| Variance | 3.702607182 |
| Monotonicity | Not monotonic |
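Several of the `vote_average` statistics above are related by simple arithmetic, which makes a quick consistency check possible. The sketch below copies the reported values and assumes the usual definitions: coefficient of variation as standard deviation divided by mean, variance as the squared standard deviation, and IQR as Q3 minus Q1. The variable names are illustrative only.

```python
# Values copied from the vote_average tables above.
mean = 5.618207215
std = 1.924215992
total = 255403.7   # reported Sum
q1, q3 = 5.0, 6.8

cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance as squared standard deviation
iqr = q3 - q1                        # interquartile range
n_non_missing = round(total / mean)  # number of rows that contributed to the mean
```

The implied row count equals the 45466 observations minus the 6 missing values, so the mean is taken over non-missing rows only.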
| Value | Count | Frequency (%) |
| 0 | 2998 | 6.6% |
| 6 | 2468 | 5.4% |
| 5 | 2001 | 4.4% |
| 7 | 1886 | 4.1% |
| 6.5 | 1722 | 3.8% |
| 6.3 | 1603 | 3.5% |
| 5.5 | 1381 | 3.0% |
| 5.8 | 1369 | 3.0% |
| 6.4 | 1350 | 3.0% |
| 6.7 | 1342 | 3.0% |
| Other values (82) | 27340 |
Minimum values
| Value | Count | Frequency (%) |
| 0 | 2998 | |
| 0.5 | 13 | < 0.1% |
| 0.7 | 1 | < 0.1% |
| 1 | 105 | 0.2% |
| 1.1 | 1 | < 0.1% |
Maximum values
| Value | Count | Frequency (%) |
| 10 | 190 | |
| 9.8 | 1 | < 0.1% |
| 9.6 | 1 | < 0.1% |
| 9.5 | 18 | < 0.1% |
| 9.4 | 3 | < 0.1% |
vote_count
Real number (ℝ)
ZEROS 
| Distinct | 1820 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 6 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 109.8973383 |
| Minimum | 0 |
|---|---|
| Maximum | 14075 |
| Zeros | 2899 |
| Zeros (%) | 6.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 355.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 10 |
| Q3 | 34 |
| 95-th percentile | 434 |
| Maximum | 14075 |
| Range | 14075 |
| Interquartile range (IQR) | 31 |
Descriptive statistics
| Standard deviation | 491.3103739 |
|---|---|
| Coefficient of variation (CV) | 4.470630331 |
| Kurtosis | 151.2028027 |
| Mean | 109.8973383 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 10.45023206 |
| Sum | 4995933 |
| Variance | 241385.8835 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 3264 | 7.2% |
| 2 | 3132 | 6.9% |
| 0 | 2899 | 6.4% |
| 3 | 2787 | 6.1% |
| 4 | 2480 | 5.5% |
| 5 | 2097 | 4.6% |
| 6 | 1747 | 3.8% |
| 7 | 1570 | 3.5% |
| 8 | 1359 | 3.0% |
| 9 | 1194 | 2.6% |
| Other values (1810) | 22931 |
Minimum values
| Value | Count | Frequency (%) |
| 0 | 2899 | |
| 1 | 3264 | |
| 2 | 3132 | |
| 3 | 2787 | |
| 4 | 2480 |
Maximum values
| Value | Count | Frequency (%) |
| 14075 | 1 | |
| 12269 | 1 | |
| 12114 | 1 | |
| 12000 | 1 | |
| 11444 | 1 |